xlsx: use TextDecoder and TextEncoder in browser #1486

myfreeer · 2020-10-03T04:34:07Z

Summary

Profiling in Chrome DevTools shows that Buffer.toString() and Buffer.from(string) take an unexpectedly long CPU time. With the native TextDecoder and TextEncoder, this gets much faster in browsers that support them.
On browsers that do not support TextDecoder, such as Internet Explorer, this falls back to the original Buffer.toString() and Buffer.from(string).
This implements almost the same change as #1458, but without monkey-patching and covering xlsx only.
Closes #1458
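
The fallback described above could look roughly like the following. This is a minimal sketch rather than the code from this PR: the helper names `bufferToString` and `stringToBuffer` are hypothetical, and it assumes the content is UTF-8.

```javascript
// Hypothetical helpers (names are illustrative, not taken from the PR).
// Feature-detect once so the check is not repeated on every call.
const textDecoder = typeof TextDecoder === 'undefined' ? null : new TextDecoder('utf-8');
const textEncoder = typeof TextEncoder === 'undefined' ? null : new TextEncoder();

// Decode a Buffer of UTF-8 bytes into a string.
function bufferToString(chunk) {
  if (typeof chunk === 'string') {
    return chunk;
  }
  if (textDecoder) {
    return textDecoder.decode(chunk);
  }
  // Fallback for environments without TextDecoder (e.g. Internet Explorer).
  return chunk.toString();
}

// Encode a string into a Buffer of UTF-8 bytes.
function stringToBuffer(str) {
  if (typeof str !== 'string') {
    return str;
  }
  if (textEncoder) {
    const bytes = textEncoder.encode(str);
    // Wrap the same memory in a Buffer without copying it.
    return Buffer.from(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  }
  // Fallback for environments without TextEncoder.
  return Buffer.from(str);
}
```

One difference to keep in mind with a sketch like this: by default TextDecoder drops a leading byte order mark, while Buffer.toString() keeps it, so the two paths may not be byte-for-byte interchangeable for inputs that start with a BOM.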

References:
feross/buffer#268
feross/buffer#60
https://developer.mozilla.org/en-US/docs/Web/API/TextDecoder
https://developer.mozilla.org/en-US/docs/Web/API/TextEncoder

Test plan

Profiling below is based on reading spec/integration/data/huge.xlsx.

Profiling result before this

Profile-before.json.gz

Profiling result after this

Profile-after.json.gz

alubbe · 2020-10-05T06:55:11Z

lib/xlsx/xlsx.js

+          });
+          let content;
+          // https://www.npmjs.com/package/process
+          if (process.browser) {


Should this be if (!process.browser), or am I misunderstanding the intention? process.browser is truthy only in browser environments, right?

myfreeer · 2020-10-05

Yes, you are right. I'll rebase and fix that.
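
For reference, the environment check under discussion can be sketched as below. This is illustrative only and not the PR's code: `isBrowser` and `decodeEntry` are hypothetical names, and the sketch assumes a bundler shim for `process` (such as the npm `process` package linked in the diff) that sets `process.browser` to true, while plain Node leaves it undefined.

```javascript
// True only when a browser shim of `process` sets the flag; false in plain Node.
const isBrowser = typeof process !== 'undefined' && Boolean(process.browser);

// Hypothetical helper: decode one entry's bytes into a string.
function decodeEntry(buffer, inBrowser = isBrowser) {
  if (inBrowser && typeof TextDecoder !== 'undefined') {
    // Browser path: native decoder instead of the slower Buffer polyfill.
    return new TextDecoder('utf-8').decode(buffer);
  }
  // Node path (or a browser without TextDecoder): keep Buffer.toString().
  return buffer.toString();
}
```

Passing `inBrowser` explicitly makes both branches easy to exercise from a Node test without a bundler.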

alubbe · 2020-10-05T07:41:48Z

Could you please run our benchmark on this branch vs master on Node and post the numbers? Node also has TextDecoder, so I would like to confirm that this PR doesn't make anything slower on the server side.

myfreeer · 2020-10-05T07:56:31Z

Env: Windows 10 x64, node.js 12.18.4 LTS x64

npm run benchmark on master(834e893)

> exceljs@4.1.1 benchmark D:\UserData\projects\exceljs
> node --expose-gc benchmark


####################################################
WARMUP: Current memory usage: 8.89 MB
WARMUP: huge xlsx file streams profiling started
Reading worksheet 1
Reading row 50000
Reading row 100000
Reading worksheet 2
Reading row 150000
Processed 2 worksheets and 150002 rows
WARMUP: huge xlsx file streams profiling finished in 8468ms
WARMUP: Current memory usage (before GC): 153.18 MB
WARMUP: Current memory usage (after GC): 41.61 MB

####################################################
RUN 1: huge xlsx file streams profiling started
Reading worksheet 1
Reading row 50000
Reading row 100000
Reading worksheet 2
Reading row 150000
Processed 2 worksheets and 150002 rows
RUN 1: huge xlsx file streams profiling finished in 8361ms
RUN 1: Current memory usage (before GC): 116.55 MB
RUN 1: Current memory usage (after GC): 25.65 MB

####################################################
RUN 2: huge xlsx file streams profiling started
Reading worksheet 1
Reading row 50000
Reading row 100000
Reading worksheet 2
Reading row 150000
Processed 2 worksheets and 150002 rows
RUN 2: huge xlsx file streams profiling finished in 7949ms
RUN 2: Current memory usage (before GC): 92.59 MB
RUN 2: Current memory usage (after GC): 25.53 MB

####################################################
RUN 3: huge xlsx file streams profiling started
Reading worksheet 1
Reading row 50000
Reading row 100000
Reading worksheet 2
Reading row 150000
Processed 2 worksheets and 150002 rows
RUN 3: huge xlsx file streams profiling finished in 8053ms
RUN 3: Current memory usage (before GC): 150.06 MB
RUN 3: Current memory usage (after GC): 41.58 MB

####################################################
WARMUP: Current memory usage: 41.27 MB
WARMUP: huge xlsx file async iteration profiling started
Reading worksheet 1
Reading row 50000
Reading row 100000
Reading worksheet 2
Reading row 150000
Processed 2 worksheets and 150002 rows
WARMUP: huge xlsx file async iteration profiling finished in 8182ms
WARMUP: Current memory usage (before GC): 110.96 MB
WARMUP: Current memory usage (after GC): 25.59 MB

####################################################
RUN 1: huge xlsx file async iteration profiling started
Reading worksheet 1
Reading row 50000
Reading row 100000
Reading worksheet 2
Reading row 150000
Processed 2 worksheets and 150002 rows
RUN 1: huge xlsx file async iteration profiling finished in 7866ms
RUN 1: Current memory usage (before GC): 104.75 MB
RUN 1: Current memory usage (after GC): 25.54 MB

####################################################
RUN 2: huge xlsx file async iteration profiling started
Reading worksheet 1
Reading row 50000
Reading row 100000
Reading worksheet 2
Reading row 150000
Processed 2 worksheets and 150002 rows
RUN 2: huge xlsx file async iteration profiling finished in 8086ms
RUN 2: Current memory usage (before GC): 147.65 MB
RUN 2: Current memory usage (after GC): 41.64 MB

####################################################
RUN 3: huge xlsx file async iteration profiling started
Reading worksheet 1
Reading row 50000
Reading row 100000
Reading worksheet 2
Reading row 150000
Processed 2 worksheets and 150002 rows
RUN 3: huge xlsx file async iteration profiling finished in 8112ms
RUN 3: Current memory usage (before GC): 108.77 MB
RUN 3: Current memory usage (after GC): 25.58 MB

npm run benchmark on myfreeer:xlsx-text-browser(d29517c)


> exceljs@4.1.1 benchmark D:\UserData\projects\exceljs
> node --expose-gc benchmark


####################################################
WARMUP: Current memory usage: 8.91 MB
WARMUP: huge xlsx file streams profiling started
Reading worksheet 1
Reading row 50000
Reading row 100000
Reading worksheet 2
Reading row 150000
Processed 2 worksheets and 150002 rows
WARMUP: huge xlsx file streams profiling finished in 8136ms
WARMUP: Current memory usage (before GC): 45.02 MB
WARMUP: Current memory usage (after GC): 11.33 MB

####################################################
RUN 1: huge xlsx file streams profiling started
Reading worksheet 1
Reading row 50000
Reading row 100000
Reading worksheet 2
Reading row 150000
Processed 2 worksheets and 150002 rows
RUN 1: huge xlsx file streams profiling finished in 8196ms
RUN 1: Current memory usage (before GC): 44.78 MB
RUN 1: Current memory usage (after GC): 11.44 MB

####################################################
RUN 2: huge xlsx file streams profiling started
Reading worksheet 1
Reading row 50000
Reading row 100000
Reading worksheet 2
Reading row 150000
Processed 2 worksheets and 150002 rows
RUN 2: huge xlsx file streams profiling finished in 7972ms
RUN 2: Current memory usage (before GC): 49.93 MB
RUN 2: Current memory usage (after GC): 13.49 MB

####################################################
RUN 3: huge xlsx file streams profiling started
Reading worksheet 1
Reading row 50000
Reading row 100000
Reading worksheet 2
Reading row 150000
Processed 2 worksheets and 150002 rows
RUN 3: huge xlsx file streams profiling finished in 8199ms
RUN 3: Current memory usage (before GC): 146.28 MB
RUN 3: Current memory usage (after GC): 41.58 MB

####################################################
WARMUP: Current memory usage: 41.28 MB
WARMUP: huge xlsx file async iteration profiling started
Reading worksheet 1
Reading row 50000
Reading row 100000
Reading worksheet 2
Reading row 150000
Processed 2 worksheets and 150002 rows
WARMUP: huge xlsx file async iteration profiling finished in 8278ms
WARMUP: Current memory usage (before GC): 113.98 MB
WARMUP: Current memory usage (after GC): 25.63 MB

####################################################
RUN 1: huge xlsx file async iteration profiling started
Reading worksheet 1
Reading row 50000
Reading row 100000
Reading worksheet 2
Reading row 150000
Processed 2 worksheets and 150002 rows
RUN 1: huge xlsx file async iteration profiling finished in 8062ms
RUN 1: Current memory usage (before GC): 64.78 MB
RUN 1: Current memory usage (after GC): 17.59 MB

####################################################
RUN 2: huge xlsx file async iteration profiling started
Reading worksheet 1
Reading row 50000
Reading row 100000
Reading worksheet 2
Reading row 150000
Processed 2 worksheets and 150002 rows
RUN 2: huge xlsx file async iteration profiling finished in 8105ms
RUN 2: Current memory usage (before GC): 96.73 MB
RUN 2: Current memory usage (after GC): 25.62 MB

####################################################
RUN 3: huge xlsx file async iteration profiling started
Reading worksheet 1
Reading row 50000
Reading row 100000
Reading worksheet 2
Reading row 150000
Processed 2 worksheets and 150002 rows
RUN 3: huge xlsx file async iteration profiling finished in 8426ms
RUN 3: Current memory usage (before GC): 102.55 MB
RUN 3: Current memory usage (after GC): 25.63 MB
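
To compare the two logs at a glance, the measured runs can be averaged. The snippet below is only a convenience sketch: the timings are copied from the RUN 1 to RUN 3 lines above (warmup runs excluded), and the names `timings`, `mean`, and `compare` are made up for this illustration.

```javascript
// Timings (ms) copied from the three measured runs above; warmup runs are excluded.
const timings = {
  streams: { master: [8361, 7949, 8053], branch: [8196, 7972, 8199] },
  asyncIteration: { master: [7866, 8086, 8112], branch: [8062, 8105, 8426] },
};

// Arithmetic mean of a list of numbers.
const mean = values => values.reduce((sum, v) => sum + v, 0) / values.length;

// Build a per-benchmark summary with both means and the relative difference.
function compare(results) {
  const summary = {};
  for (const [name, { master, branch }] of Object.entries(results)) {
    const masterMean = mean(master);
    const branchMean = mean(branch);
    summary[name] = {
      masterMean,
      branchMean,
      deltaPercent: ((branchMean - masterMean) / masterMean) * 100,
    };
  }
  return summary;
}

const summary = compare(timings);
console.log(summary);
```

This works out to roughly 8121 ms vs 8122 ms for streams and 8021 ms vs 8198 ms for async iteration (master vs branch). With only three runs per configuration, a difference of a couple of percent may well be run-to-run noise.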

alubbe

thank you!

myfreeer force-pushed the xlsx-text-browser branch from f4153d4 to 8eb2ccd Compare October 3, 2020 04:42

alubbe reviewed Oct 5, 2020

myfreeer force-pushed the xlsx-text-browser branch from 8eb2ccd to 99f48c3 Compare October 5, 2020 07:24

myfreeer force-pushed the xlsx-text-browser branch from 99f48c3 to d29517c Compare October 5, 2020 07:49

alubbe approved these changes Oct 5, 2020

alubbe merged commit 819b09c into exceljs:master Oct 5, 2020

myfreeer deleted the xlsx-text-browser branch October 5, 2020 10:27

DhavalW mentioned this pull request Nov 30, 2023

[Snyk] Security upgrade exceljs from 0.2.7 to 4.2.0 DhavalW/exceljs#21

Open
