XML - Extensible Markup Language (TMS) - Phrase

The document provides instructions for importing .XML files into Phrase TMS for translation. It explains that .XML files require additional import settings because the format is not designed for translation. It outlines default and optional import settings for elements, attributes, inline elements, and more. It also provides guidance for importing multilingual .XML files containing source and target text segments.

Uploaded by

Zhihao Wang
Copyright
© All Rights Reserved

.XML - Extensible Markup Language (TMS)


Article content is machine translated from English to other languages by Phrase Translate.

The .XML file format is not designed for translation and requires additional settings for successful
import.

Default settings are marked with an asterisk (*) and will import all XML elements for translation.
Import options can be used to change the import behavior.

File Types
.XML

Import Options
Plain import rules

Elements

Only selected elements (e.g. name, title, para) are imported. An asterisk (*) imports all elements.

Attributes

Only selected attributes (e.g. name, title, para) are imported. An asterisk (*) imports all attributes.

Translatable inline elements

If the Identify inline elements automatically option is selected, all elements in the translatable text
are imported as Translatable inline elements.

Non-translatable inline elements

Selected inline elements (e.g. name, title, para) are converted into tags and their content is not
translatable.

Identify inline elements automatically

Elements that are neighbors of text nodes will be automatically converted to inline tags.
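
The rule above can be illustrated with a minimal Python sketch. It assumes that "neighbor of a text node" means an element with non-blank text directly before or after it inside the same parent. This is one reading of the sentence, not Phrase's actual implementation, and the function name inline_candidates is hypothetical.

```python
import xml.etree.ElementTree as ET

def inline_candidates(root):
    """Return tag names of elements that sit directly next to non-blank text."""
    found = []
    for parent in root.iter():
        previous_text = parent.text  # text before the first child
        for child in parent:
            has_text_before = bool(previous_text and previous_text.strip())
            has_text_after = bool(child.tail and child.tail.strip())
            if has_text_before or has_text_after:
                found.append(child.tag)
            previous_text = child.tail  # text before the next sibling
    return found

doc = ET.fromstring(
    "<doc><para>Press <b>Save</b> to continue.</para><title>Intro</title></doc>"
)
print(inline_candidates(doc))  # ['b']
```

Here only b is surrounded by text, so it is the only inline candidate; para and title have no neighboring text and would remain structural elements.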

Elements (processed as HTML)

Selected element code is processed as .HTML. HTML Import Settings  such as Preserve
Whitespaces or Break tag (<br/>) creates new segment can be used for these elements.

Locked elements

The selected elements will be imported as Locked.

Locked attributes

The selected attributes will be imported as Locked.

Parse ICU messages

ICU messages are automatically converted to tags. Files with ICU messages cannot contain any
inline elements.

Import XML entities

XML entities declared in the DTD will be imported for translation.

Segment XML

Deselect if segmentation is not desired.

Import comments

Comments are not imported if elements are processed as HTML as indicated in the Elements
(processed as HTML) option.

Convert to Phrase TMS tags

Apply regular expressions  to convert specified text to tags.
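
As a rough illustration of what such a regular expression might match, the Python sketch below finds printf-style and brace placeholders in a string. The pattern and the sample text are assumptions chosen for the example; the expression needed for a real project depends on the placeholders used in the file.

```python
import re

# Hypothetical pattern: matches %s, %d and {name}-style placeholders.
PLACEHOLDER = re.compile(r"%[sd]|\{\w+\}")

def find_tag_candidates(text):
    """Return every substring the pattern would turn into a tag."""
    return PLACEHOLDER.findall(text)

print(find_tag_candidates("Hello {name}, you have %d new messages."))
# ['{name}', '%d']
```

Testing a pattern against a few representative strings like this before import helps confirm that it matches the placeholders and nothing else.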

Convert to character entities


Enter a comma-separated list of character references. The corresponding characters are written
as these references in the output file.

Example:

To export quotation marks (") as &quot; and the character Σ as &#x3A3; , enter &quot;,&#x3A3; .
The characters & and < are always exported as &amp; and &lt; respectively.
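
The effect of this option on exported text can be sketched in Python. This is an illustration of the described behavior under stated assumptions, not Phrase's code: the function name export_text is hypothetical, and the standard library's html.unescape is used only to find out which character each listed reference stands for.

```python
import html

def export_text(text, references):
    """Escape & and < always; escape other characters only if listed."""
    # Map each character to the reference that was entered for it.
    mapping = {
        html.unescape(ref.strip()): ref.strip()
        for ref in references.split(",")
        if ref.strip()
    }
    out = []
    for ch in text:
        if ch == "&":
            out.append("&amp;")
        elif ch == "<":
            out.append("&lt;")
        else:
            out.append(mapping.get(ch, ch))
    return "".join(out)

print(export_text('Σ < "x" & y', "&quot;,&#x3A3;"))
# &#x3A3; &lt; &quot;x&quot; &amp; y
```

With an empty list, only & and < are converted and all other characters are written unchanged.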

XML settings using XPath

Using the XPath query language  allows for the creation of complex import rules and some
additional features unavailable in plain import rules.

The XPath expression should select the elements and/or attributes whose text/value should be
translated, not the text node itself.

Familiarity with XPath is recommended before using.

Context note, Context key, and Max. target length will not be processed for files with more than
10,000 XML elements.

Context key

Constitutes TM context (101% matches) if applicable.

Context note

Import elements or context attributes for each element.

Max. target length

Import elements or the maximum target length for each element. The character limit for each
segment is displayed on the Context note pane inside the editor. Any character exceeding the
limit is highlighted in red.

Preserve whitespaces

Leave empty to preserve whitespaces in elements that apply xml:space='preserve'. Use //* to
preserve all whitespaces in all elements, or use an arbitrary XPath expression.

HTML preview with XSLT stylesheet


XSLT language (Extensible Stylesheet Language Transformations) can be used to transform .XML
documents into .HTML format for in-context preview  purposes. Accordingly, preview files
downloaded via Preview translation in the Document menu come with HTML extension. Phrase
currently supports XSLT 2.0.

Click Choose file to import a stylesheet.

Click Download XSLT to download the stylesheet after file import.

CDATA in XML file


CDATA means Character Data. A CDATA section is a block of text that the parser does not interpret
as markup, even if it contains characters that would otherwise be recognized as markup. Predefined
entities such as &lt;, &gt;, and &amp; are tedious to type and difficult to read in the markup. In
such cases, a CDATA section can be used instead.

If CDATA contains embedded .HTML, the corresponding XML elements should be listed under
Elements (processed as HTML).

If the source file contains CDATA and the Segment XML option is selected, a separate CDATA section
is written for every segment in the Completed file.

CDATA will only be segmented if there is a clear indication of a segment break such as punctuation
or spacing.

Source:

<text><![CDATA[Translatable text A. Translatable text B.]]></text>

Target:

<text><![CDATA[Translatable text A.]]><![CDATA[ ]]><![CDATA[Translatable text B.]]></text>

The Completed file is valid .XML and the XML viewer will display the text correctly as Translatable
text A. Translatable text B.
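
A quick way to check this is to parse both versions and compare the text a parser reports. The sketch below uses Python's standard xml.etree.ElementTree; the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

source = "<text><![CDATA[Translatable text A. Translatable text B.]]></text>"
target = (
    "<text><![CDATA[Translatable text A.]]><![CDATA[ ]]>"
    "<![CDATA[Translatable text B.]]></text>"
)

# Adjacent CDATA sections are reported to the application as one run of text.
source_text = ET.fromstring(source).text
target_text = ET.fromstring(target).text

print(target_text)                 # Translatable text A. Translatable text B.
print(source_text == target_text)  # True
```

Only tools that compare the raw markup, rather than the parsed text, will see a difference between the two files.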

Application Specific Settings 

Multilingual XML
Multilingual files are imported as multiple bilingual jobs with languages mapped before import. They
are marked with a dedicated icon in the jobs table. If imported into several target languages, the
Completed file is composed of all target languages.

Phrase supports XML files that have both source and target elements present for all paragraphs, even
if the target is empty. When the source and target segmentation differ, the source segmentation
takes precedence.

Individual language elements must all be descendants of the same trans-unit element and one
language cannot be contained within the other. Source and target content cannot be stored in
attribute values. If multiple elements match the XPath for source or target inside the trans-unit
element, only the first one is imported for translation.

When creating a job , select Multilingual XML from the File Type pane before applying Import
Options. If not specified, the file will be imported as standard .XML.

The tag content of the source .XML file can be displayed in the editor by clicking Expand tags under
the Tool menu and edited by pressing F2 .

Example:

Sample of partially translated text from English to German and French. All <tuv lang="en"> ,
<tuv lang="de"> and <tuv lang="fr"> are children of the same <tu> element.

<?xml version="1.0" encoding="utf-8"?>
<root>
  Not translatable text.
  <tu note="context note" key="ID 254" maxlen="16">
    <tuv lang="en">
      <seg>First segment.</seg>
    </tuv>
    <tuv lang="de">
      <seg>Erste segment</seg>
    </tuv>
    <tuv lang="fr">
      <seg></seg>
    </tuv>
  </tu>
  <tu note="another context note" key="ID 255" maxlen="18">
    <tuv lang="en">
      <seg>Second segment.</seg>
    </tuv>
    <tuv lang="de">
      <seg></seg>
    </tuv>
    <tuv lang="fr">
      <seg></seg>
    </tuv>
  </tu>
</root>

Import Options

For the import of Multilingual .XML files, the XPath  query language must be used. See example
above for reference. The XPath expression defines the elements in which the text/value should be
translated and not the actual text node.

Elements containing source and target sub-elements

//tu

Elements containing source text

tuv[@lang='en']/seg (in relation to the parent element //tu )

Elements containing target text

tuv[@lang='de']/seg (in relation to the parent element //tu )

Elements containing target text

tuv[@lang='fr']/seg (in relation to the parent element //tu )
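
How these expressions work together can be tried outside Phrase with a short Python sketch. It is an illustration only, not how Phrase processes the file: the function name extract_pairs is hypothetical, and because the standard xml.etree.ElementTree module supports only a subset of XPath, //tu is written as .//tu relative to the root.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<root>
  <tu note="context note" key="ID 254" maxlen="16">
    <tuv lang="en"><seg>First segment.</seg></tuv>
    <tuv lang="de"><seg>Erste segment</seg></tuv>
    <tuv lang="fr"><seg></seg></tuv>
  </tu>
  <tu note="another context note" key="ID 255" maxlen="18">
    <tuv lang="en"><seg>Second segment.</seg></tuv>
    <tuv lang="de"><seg></seg></tuv>
    <tuv lang="fr"><seg></seg></tuv>
  </tu>
</root>"""

def extract_pairs(xml_text, source_lang, target_lang):
    """Return (source, target) text for every unit; empty targets become ''."""
    root = ET.fromstring(xml_text)
    pairs = []
    for unit in root.findall(".//tu"):  # elements containing source and target
        # find() returns only the first match, evaluated relative to the unit.
        source = unit.find(f"tuv[@lang='{source_lang}']/seg")
        target = unit.find(f"tuv[@lang='{target_lang}']/seg")
        if source is None or target is None:
            continue  # skip units missing either language
        pairs.append((source.text or "", target.text or ""))
    return pairs

print(extract_pairs(SAMPLE, "en", "de"))
# [('First segment.', 'Erste segment'), ('Second segment.', '')]
```

The source and target paths are evaluated relative to each tu element, which is why they do not start with // .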

Non-translatable inline elements

All elements in source or target are considered Translatable inline elements unless specified here
as Non-translatable inline elements.

Convert to Phrase TMS tags

Apply regular expressions  to convert specified text to tags.


Context key

Specify a context key that is saved with the segment to the translation memory  and used for
match context.

Context note

Import elements or context attributes for each element.

Max. target length

Import elements or the maximum target length for each element.

Convert to character entities

Enter a comma-separated list of character references. The corresponding characters are written
as these references in the output file.

Example:

To export quotation marks (") as &quot; and the character Σ as &#x3A3; , enter &quot;,&#x3A3; .
The characters & and < are always exported as &amp; and &lt; respectively.

Parse ICU messages

ICU messages are automatically converted to tags. Files with ICU messages cannot contain any
inline elements.

Use HTML subfilter

Imports HTML tags contained in the file. Tags can then be used with HTML File Import Settings.
Paragraph tags <p> will create new segments even if Segment multilingual XML is not selected.

Segment multilingual XML

Text is segmented by a general segmentation rule  rather than one segment per cell.

Caution
Applying Segment multilingual XML to a file that contains target text may result in a different
number of segments in the source than in the target.

Set segment status of non-empty target

Select default confirmation status and whether confirmed segments are automatically added to
TM.

Example:

If a multilingual .XML file uses namespaces, the XPath expressions could be the following:

Elements containing source and target sub-elements

//*[local-name()='trans-unit']

Elements containing source text

*[local-name()='source']

Elements containing target text

*[local-name()='target']
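
The reason for matching on the local name can be seen in a small Python sketch. The sample document and its namespace URI are made up for the illustration. Python's standard xml.etree.ElementTree does not implement the XPath local-name() function, so the sketch uses that library's own {*} namespace wildcard (Python 3.8 or later) as the closest equivalent.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespaced file; the URI is a placeholder.
NAMESPACED = """<file xmlns="urn:example:ns">
  <trans-unit id="1">
    <source>Hello</source>
    <target>Hallo</target>
  </trans-unit>
</file>"""

root = ET.fromstring(NAMESPACED)

# A plain element name does not match elements in a default namespace.
plain = root.findall(".//trans-unit")

# Matching on the local name only ignores the namespace.
wildcard = root.findall(".//{*}trans-unit")

print(len(plain), len(wildcard))  # 0 1
for unit in wildcard:
    print(unit.find("{*}source").text, unit.find("{*}target").text)  # Hello Hallo
```

An expression such as //trans-unit fails in the same way on a namespaced file, which is why the local-name() form is used instead.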
