[DOC] Tweaks for String#dump #13883

Closed
wants to merge 3 commits into from
3 changes: 1 addition & 2 deletions doc/string.rb
@@ -458,8 +458,7 @@
#
# _Substitution_
#
-# - #dump: Returns a copy of +self+ with all non-printing characters replaced by \xHH notation
-# and all special characters escaped.
+# - #dump: Returns a printable version of +self+, enclosed in double-quotes.
# - #undump: Returns a copy of +self+ with all <tt>\xNN</tt> notations replaced by <tt>\uNNNN</tt> notations
# and all escaped characters unescaped.
# - #sub: Returns a copy of +self+ with the first substring matching a given pattern
45 changes: 45 additions & 0 deletions doc/string/dump.rdoc
@@ -0,0 +1,45 @@
Returns a printable version of +self+, enclosed in double-quotes:

'hello'.dump # => "\"hello\""
'тест'.dump # => "\"\\u0442\\u0435\\u0441\\u0442\""
'こんにちは'.dump # => "\"\\u3053\\u3093\\u306B\\u3061\\u306F\""

The format of the result depends on the encoding of the string:

s = 'hello'
s.encoding # => #<Encoding:UTF-8>
s.dump # => "\"hello\""
s.encode('utf-16').dump # => "\"\\xFE\\xFF\\x00h\\x00e\\x00l\\x00l\\x00o\".dup.force_encoding(\"UTF-16\")"
s.encode('utf-16le').dump # => "\"h\\x00e\\x00l\\x00l\\x00o\\x00\".dup.force_encoding(\"UTF-16LE\")"

s = 'тест'
s.encoding # => #<Encoding:UTF-8>
s.dump # => "\"\\u0442\\u0435\\u0441\\u0442\""
s.encode('utf-16').dump # => "\"\\xFE\\xFF\\x04B\\x045\\x04A\\x04B\".dup.force_encoding(\"UTF-16\")"
s.encode('utf-16le').dump # => "\"B\\x045\\x04A\\x04B\\x04\".dup.force_encoding(\"UTF-16LE\")"

s = 'こんにちは'
s.encoding # => #<Encoding:UTF-8>
s.dump # => "\"\\u3053\\u3093\\u306B\\u3061\\u306F\""
s.encode('utf-16').dump # => "\"\\xFE\\xFF0S0\\x930k0a0o\".dup.force_encoding(\"UTF-16\")"
s.encode('utf-16le').dump # => "\"S0\\x930k0a0o0\".dup.force_encoding(\"UTF-16LE\")"
Comment on lines +27 to +43
Member

I think it would be better to move the examples of non-UTF-8 encodings to a separate section with some text describing them (e.g. that the result uses hexadecimal format and appends dup.force_encoding(<encoding name>)). This is because non-UTF-8 is more of an edge case than a commonly used case.

Member Author

I've moved the cited lines to the end. I think you want other changes, but I'm not sure what exactly is needed. Can you fix up one, as a guide for me?

Member Author

@peterzhu2118, I'll take another shot at this; marking as Draft in the interim.

Member Author

@peterzhu2118, I take it back. I don't know what to do with this.

Member

I opened #13965
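
The behavior discussed in this thread can be sketched as follows. This is an illustrative example, not part of the PR; it assumes a recent Ruby in which String#dump appends the .dup.force_encoding suffix shown in the diff above, and the variable names are made up for the sketch.

```ruby
# Sketch: String#dump output for an ASCII-compatible encoding (UTF-8)
# versus a non-ASCII-compatible one (UTF-16LE). Plain Ruby, no gems.
utf8  = 'hello'
utf16 = utf8.encode('UTF-16LE')

utf8_dump  = utf8.dump   # quoted, escaped text only
utf16_dump = utf16.dump  # hexadecimal escapes plus a force_encoding suffix

puts utf8_dump
puts utf16_dump

# The suffix names the encoding of the original string (assumed format).
has_suffix = utf16_dump.end_with?('.dup.force_encoding("UTF-16LE")')

# String#undump reads that suffix and restores both the bytes and the encoding.
restored      = utf16_dump.undump
round_trip_ok = restored == utf16 && restored.encoding == Encoding::UTF_16LE
```

The suffix is what makes the round trip possible: without it, the dumped bytes alone would not tell undump which encoding to restore.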


Certain special characters are rendered with escapes:

'"'.dump # => "\"\\\"\""
'\\'.dump # => "\"\\\\\""

Non-printing characters are rendered with escapes:

s = ''
s << 7 # Alarm (bell).
s << 8 # Back space.
s << 9 # Horizontal tab.
s << 10 # Line feed.
s << 11 # Vertical tab.
s << 12 # Form feed.
s << 13 # Carriage return.
s # => "\a\b\t\n\v\f\r"
s.dump # => "\"\\a\\b\\t\\n\\v\\f\\r\""

Related: see {Converting to New String}[rdoc-ref:String@Converting+to+New+String].
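
As a rough illustration of the escaping rules documented above (not part of the PR), the sketch below dumps a string containing special and non-printing characters and reverses it with String#undump. The sample string and variable names are hypothetical.

```ruby
# Sketch: dump renders special and non-printing characters as escapes,
# and undump reverses the process.
original = "say \"hi\"\t\\ bell:\a\n"

dumped = original.dump
puts dumped

# Every character of the dumped form is printable ASCII.
ascii_only = dumped.ascii_only?
printable  = dumped.match?(/\A[[:print:]]*\z/)

# undump yields a string equal to the original.
restored = dumped.undump
same     = restored == original
```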
11 changes: 2 additions & 9 deletions string.c
@@ -7412,16 +7412,9 @@ rb_str_inspect(VALUE str)

/*
* call-seq:
-* dump -> string
+* dump -> new_string
*
-* Returns a printable version of +self+, enclosed in double-quotes,
-* with special characters escaped, and with non-printing characters
-* replaced by hexadecimal notation:
-*
-* "hello \n ''".dump # => "\"hello \\n ''\""
-* "\f\x00\xff\\\"".dump # => "\"\\f\\x00\\xFF\\\\\\\"\""
-*
-* Related: String#undump (inverse of String#dump).
+* :include: doc/string/dump.rdoc
*
*/
