xlsx: use TextDecoder and TextEncoder in browser #1486

Merged
merged 1 commit into from
Oct 5, 2020

Conversation

myfreeer
Contributor

@myfreeer myfreeer commented Oct 3, 2020

Summary

Profiling in Chrome DevTools shows that `Buffer.toString()` and `Buffer.from(string)` take an unexpectedly large share of CPU time. Using the native TextDecoder and TextEncoder makes these conversions much faster in browsers that support them.
In browsers without TextDecoder, such as Internet Explorer, this falls back to the original `Buffer.toString()` and `Buffer.from(string)`.
This implements almost the same change as #1458 without monkey-patching, and covers xlsx only.
Closes #1458
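
The feature detection and fallback described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the helper names `bufferToString` and `stringToBuffer` are hypothetical, and it assumes a `Buffer` implementation is available (Node.js, or the `buffer` polyfill in a browser bundle).

```javascript
// Hypothetical helpers for illustration; names are assumptions, not the PR's API.
const hasTextCodec =
  typeof TextDecoder !== 'undefined' && typeof TextEncoder !== 'undefined';

// Create the codec objects once so they can be reused for every conversion.
const textDecoder = hasTextCodec ? new TextDecoder('utf-8') : null;
const textEncoder = hasTextCodec ? new TextEncoder() : null;

// Decode UTF-8 bytes (Buffer or Uint8Array) into a string.
function bufferToString(chunk) {
  if (typeof chunk === 'string') {
    return chunk;
  }
  if (textDecoder) {
    return textDecoder.decode(chunk);
  }
  // Fallback for environments without TextDecoder (e.g. Internet Explorer).
  return Buffer.from(chunk).toString('utf8');
}

// Encode a string into a Buffer of UTF-8 bytes.
function stringToBuffer(str) {
  if (typeof str !== 'string') {
    return str;
  }
  if (textEncoder) {
    const bytes = textEncoder.encode(str);
    // Wrap the encoded bytes without copying them.
    return Buffer.from(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  }
  // Fallback for environments without TextEncoder.
  return Buffer.from(str, 'utf8');
}
```

One behavioural difference to keep in mind with a sketch like this: `TextDecoder` strips a leading UTF-8 byte order mark by default, while `Buffer.toString()` keeps it, so the two paths are not guaranteed to be byte-for-byte interchangeable on every input.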

References:
feross/buffer#268
feross/buffer#60
https://developer.mozilla.org/en-US/docs/Web/API/TextDecoder
https://developer.mozilla.org/en-US/docs/Web/API/TextEncoder

Test plan

Profiling below is based on reading spec/integration/data/huge.xlsx.

Profiling result before this

before
Profile-before.json.gz

Profiling result after this

after
Profile-after.json.gz

});
let content;
// https://www.npmjs.com/package/process
if (process.browser) {
Member

should this be if (!process.browser) or am I misunderstanding the intention? process.browser is truthy only in browser environments, right?

Contributor Author

Yes, you are right. I'll rebase and fix that.

@alubbe
Member

alubbe commented Oct 5, 2020

Could you please run our benchmark on this branch vs master on node and post the numbers? Node also has TextDecoder, so I would like to know that this PR doesn't make anything slower on the server-side
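
For a quick isolated check of the decoding step on Node, separate from the project's `npm run benchmark`, a micro-benchmark along these lines could be used. This is only a sketch: the `timeIt` helper, the sample XML fragment, and the iteration count are all assumptions made for illustration, and the timings it prints will vary by machine and Node version.

```javascript
// Hypothetical micro-benchmark; not part of the exceljs benchmark suite.
const { performance } = require('perf_hooks');

// Run fn repeatedly and report the elapsed wall-clock time in milliseconds.
function timeIt(label, fn, iterations) {
  const start = performance.now();
  let total = 0;
  for (let i = 0; i < iterations; i++) {
    total += fn().length; // consume the result so the call cannot be skipped
  }
  const ms = performance.now() - start;
  return { label, ms, total };
}

// Assumed sample input: a repeated cell-like XML fragment encoded as UTF-8.
const sample = Buffer.from('<c r="A1" t="s"><v>12345</v></c>'.repeat(2000), 'utf8');
const decoder = new TextDecoder('utf-8');

const results = [
  timeIt('Buffer.toString', () => sample.toString('utf8'), 200),
  timeIt('TextDecoder.decode', () => decoder.decode(sample), 200),
];

for (const { label, ms } of results) {
  console.log(`${label}: ${ms.toFixed(1)}ms`);
}
```

A micro-benchmark like this only isolates string decoding; the full benchmark run remains the better signal for end-to-end read performance.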

@myfreeer
Contributor Author

myfreeer commented Oct 5, 2020

Env: Windows 10 x64, node.js 12.18.4 LTS x64

npm run benchmark on master(834e893)
> exceljs@4.1.1 benchmark D:\UserData\projects\exceljs
> node --expose-gc benchmark


####################################################
WARMUP: Current memory usage: 8.89 MB
WARMUP: huge xlsx file streams profiling started
Reading worksheet 1
Reading row 50000
Reading row 100000
Reading worksheet 2
Reading row 150000
Processed 2 worksheets and 150002 rows
WARMUP: huge xlsx file streams profiling finished in 8468ms
WARMUP: Current memory usage (before GC): 153.18 MB
WARMUP: Current memory usage (after GC): 41.61 MB

####################################################
RUN 1: huge xlsx file streams profiling started
Reading worksheet 1
Reading row 50000
Reading row 100000
Reading worksheet 2
Reading row 150000
Processed 2 worksheets and 150002 rows
RUN 1: huge xlsx file streams profiling finished in 8361ms
RUN 1: Current memory usage (before GC): 116.55 MB
RUN 1: Current memory usage (after GC): 25.65 MB

####################################################
RUN 2: huge xlsx file streams profiling started
Reading worksheet 1
Reading row 50000
Reading row 100000
Reading worksheet 2
Reading row 150000
Processed 2 worksheets and 150002 rows
RUN 2: huge xlsx file streams profiling finished in 7949ms
RUN 2: Current memory usage (before GC): 92.59 MB
RUN 2: Current memory usage (after GC): 25.53 MB

####################################################
RUN 3: huge xlsx file streams profiling started
Reading worksheet 1
Reading row 50000
Reading row 100000
Reading worksheet 2
Reading row 150000
Processed 2 worksheets and 150002 rows
RUN 3: huge xlsx file streams profiling finished in 8053ms
RUN 3: Current memory usage (before GC): 150.06 MB
RUN 3: Current memory usage (after GC): 41.58 MB

####################################################
WARMUP: Current memory usage: 41.27 MB
WARMUP: huge xlsx file async iteration profiling started
Reading worksheet 1
Reading row 50000
Reading row 100000
Reading worksheet 2
Reading row 150000
Processed 2 worksheets and 150002 rows
WARMUP: huge xlsx file async iteration profiling finished in 8182ms
WARMUP: Current memory usage (before GC): 110.96 MB
WARMUP: Current memory usage (after GC): 25.59 MB

####################################################
RUN 1: huge xlsx file async iteration profiling started
Reading worksheet 1
Reading row 50000
Reading row 100000
Reading worksheet 2
Reading row 150000
Processed 2 worksheets and 150002 rows
RUN 1: huge xlsx file async iteration profiling finished in 7866ms
RUN 1: Current memory usage (before GC): 104.75 MB
RUN 1: Current memory usage (after GC): 25.54 MB

####################################################
RUN 2: huge xlsx file async iteration profiling started
Reading worksheet 1
Reading row 50000
Reading row 100000
Reading worksheet 2
Reading row 150000
Processed 2 worksheets and 150002 rows
RUN 2: huge xlsx file async iteration profiling finished in 8086ms
RUN 2: Current memory usage (before GC): 147.65 MB
RUN 2: Current memory usage (after GC): 41.64 MB

####################################################
RUN 3: huge xlsx file async iteration profiling started
Reading worksheet 1
Reading row 50000
Reading row 100000
Reading worksheet 2
Reading row 150000
Processed 2 worksheets and 150002 rows
RUN 3: huge xlsx file async iteration profiling finished in 8112ms
RUN 3: Current memory usage (before GC): 108.77 MB
RUN 3: Current memory usage (after GC): 25.58 MB

npm run benchmark on myfreeer:xlsx-text-browser(d29517c)

> exceljs@4.1.1 benchmark D:\UserData\projects\exceljs
> node --expose-gc benchmark


####################################################
WARMUP: Current memory usage: 8.91 MB
WARMUP: huge xlsx file streams profiling started
Reading worksheet 1
Reading row 50000
Reading row 100000
Reading worksheet 2
Reading row 150000
Processed 2 worksheets and 150002 rows
WARMUP: huge xlsx file streams profiling finished in 8136ms
WARMUP: Current memory usage (before GC): 45.02 MB
WARMUP: Current memory usage (after GC): 11.33 MB

####################################################
RUN 1: huge xlsx file streams profiling started
Reading worksheet 1
Reading row 50000
Reading row 100000
Reading worksheet 2
Reading row 150000
Processed 2 worksheets and 150002 rows
RUN 1: huge xlsx file streams profiling finished in 8196ms
RUN 1: Current memory usage (before GC): 44.78 MB
RUN 1: Current memory usage (after GC): 11.44 MB

####################################################
RUN 2: huge xlsx file streams profiling started
Reading worksheet 1
Reading row 50000
Reading row 100000
Reading worksheet 2
Reading row 150000
Processed 2 worksheets and 150002 rows
RUN 2: huge xlsx file streams profiling finished in 7972ms
RUN 2: Current memory usage (before GC): 49.93 MB
RUN 2: Current memory usage (after GC): 13.49 MB

####################################################
RUN 3: huge xlsx file streams profiling started
Reading worksheet 1
Reading row 50000
Reading row 100000
Reading worksheet 2
Reading row 150000
Processed 2 worksheets and 150002 rows
RUN 3: huge xlsx file streams profiling finished in 8199ms
RUN 3: Current memory usage (before GC): 146.28 MB
RUN 3: Current memory usage (after GC): 41.58 MB

####################################################
WARMUP: Current memory usage: 41.28 MB
WARMUP: huge xlsx file async iteration profiling started
Reading worksheet 1
Reading row 50000
Reading row 100000
Reading worksheet 2
Reading row 150000
Processed 2 worksheets and 150002 rows
WARMUP: huge xlsx file async iteration profiling finished in 8278ms
WARMUP: Current memory usage (before GC): 113.98 MB
WARMUP: Current memory usage (after GC): 25.63 MB

####################################################
RUN 1: huge xlsx file async iteration profiling started
Reading worksheet 1
Reading row 50000
Reading row 100000
Reading worksheet 2
Reading row 150000
Processed 2 worksheets and 150002 rows
RUN 1: huge xlsx file async iteration profiling finished in 8062ms
RUN 1: Current memory usage (before GC): 64.78 MB
RUN 1: Current memory usage (after GC): 17.59 MB

####################################################
RUN 2: huge xlsx file async iteration profiling started
Reading worksheet 1
Reading row 50000
Reading row 100000
Reading worksheet 2
Reading row 150000
Processed 2 worksheets and 150002 rows
RUN 2: huge xlsx file async iteration profiling finished in 8105ms
RUN 2: Current memory usage (before GC): 96.73 MB
RUN 2: Current memory usage (after GC): 25.62 MB

####################################################
RUN 3: huge xlsx file async iteration profiling started
Reading worksheet 1
Reading row 50000
Reading row 100000
Reading worksheet 2
Reading row 150000
Processed 2 worksheets and 150002 rows
RUN 3: huge xlsx file async iteration profiling finished in 8426ms
RUN 3: Current memory usage (before GC): 102.55 MB
RUN 3: Current memory usage (after GC): 25.63 MB


Member

@alubbe alubbe left a comment

thank you!

@alubbe alubbe merged commit 819b09c into exceljs:master Oct 5, 2020
@myfreeer myfreeer deleted the xlsx-text-browser branch October 5, 2020 10:27
2 participants