Commit 6a20ee4

Merge pull request #2510 from sparklemotion/flavorjones-encoding-reader-performance-v1.13.x
improve encoding reader performance (backport to v1.13.x)
2 parents b848031 + e444525 commit 6a20ee4

2 files changed: +13 -1 lines changed

lib/nokogiri/html4/document.rb

Lines changed: 1 addition & 1 deletion
@@ -268,7 +268,7 @@ def start_element(name, attrs = [])
         end
 
         def self.detect_encoding(chunk)
-          (m = chunk.match(/\A(<\?xml[ \t\r\n]+[^>]*>)/)) &&
+          (m = chunk.match(/\A(<\?xml[ \t\r\n][^>]*>)/)) &&
             (return Nokogiri.XML(m[1]).encoding)
 
           if Nokogiri.jruby?
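
The one-character change above can be illustrated with a minimal sketch. Only the two regex literals come from the diff; the constant names and sample strings are hypothetical. Because whitespace is also matched by `[^>]*`, both patterns accept exactly the same declarations. The difference shows up only on input with no closing `>`: with `[ \t\r\n]+` followed by `[^>]*`, a backtracking engine may retry every way of splitting the whitespace run between the two quantifiers before giving up, which can grow quadratically with the run length depending on the Ruby version and regex engine. With a single whitespace character there is only one split to try.

```ruby
# Hypothetical constant names; only the regex literals are taken from the diff.
OLD_XML_DECL = /\A(<\?xml[ \t\r\n]+[^>]*>)/
NEW_XML_DECL = /\A(<\?xml[ \t\r\n][^>]*>)/

# On well-formed input both patterns capture the same declaration,
# because [^>]* also consumes any additional whitespace.
decl = %(<?xml   version="1.0" encoding="Shift_JIS"?><html></html>)
old_match = OLD_XML_DECL.match(decl)
new_match = NEW_XML_DECL.match(decl)

# Input without a closing ">" can never match either pattern.
# The run of spaces is kept short here so the old pattern finishes quickly.
unterminated = "<?xml " + (" " * 2_000)
old_miss = OLD_XML_DECL.match(unterminated)
new_miss = NEW_XML_DECL.match(unterminated)

puts old_match[1] == new_match[1]
puts old_miss.nil? && new_miss.nil?
```

The sketch checks equivalence only; it deliberately makes no timing claim, since the cost of the old pattern varies across Ruby releases.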

test/html4/test_document_encoding.rb

Lines changed: 12 additions & 0 deletions
@@ -155,6 +155,18 @@ def binopen(file)
                 end
               end
             end
+
+            it "does not start backtracking during detection of XHTML encoding" do
+              # this test is a quick and dirty version
+              # of the more complete perf test that is on main.
+              n = 40_000
+              redos_string = "<?xml " + (" " * n)
+              redos_string.encode!("ASCII-8BIT")
+              start_time = Time.now
+              Nokogiri::HTML4(redos_string)
+              elapsed_time = Time.now - start_time
+              assert_operator(elapsed_time, :<, 1)
+            end
           end
         end
       end
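
As a side note on the timing check in this test, here is a hedged sketch of the same kind of measurement taken with a monotonic clock, which is not affected by wall-clock adjustments. The `elapsed_seconds` helper is hypothetical and not part of Nokogiri or its test suite. The sketch times only the patched regex from the diff, not `Nokogiri::HTML4`, so that it runs without the gem.

```ruby
# Hypothetical helper: returns how long the given block took, in seconds,
# measured with the monotonic clock from Ruby's standard Process module.
def elapsed_seconds
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

# The patched pattern from the diff, applied to the same kind of input
# the test builds: an XML declaration prefix followed by many spaces.
pattern = /\A(<\?xml[ \t\r\n][^>]*>)/
input = "<?xml " + (" " * 40_000)

result = nil
seconds = elapsed_seconds { result = pattern.match(input) }

puts result.nil?
puts seconds < 1
```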
