Gcsort Manual
[15 JAN 2015 Version]
User’s Guide
1st Edition, 15 January 2016
Sauro Menna
mennasauro@gmail.com
2. Features
_______________________________________________________________________________________
gcsort help
gcsort is a program to sort, merge and copy records in a file into a specified order
________________________________________________________________________________________
Syntax case insensitive
Return code : 0 (ok) - 4 (warning) - 16 (error)
________________________________________________________________________________________
Usage with file parameters : gcsort <options> take filename
Usage from command line : gcsort <options> <control statements>
________________________________________________________________________________________
gcsort options
-fsign=[ASCII|EBCDIC] define display sign representation
-fcolseq=[NATIVE|ASCII|EBCDIC] collating sequence to use
-febcdic-table=<cconv-table>/<file> EBCDIC/ASCII translation table
________________________________________________________________________________________
gcsort control statements
Notations: '{name}' = parameters , '|' = Alternative format of control statement
________________________________________________________________________________________
SORT | MERGE | COPY FIELDS Control statement for Sort or Merge file(s)
USE Declare input file(s)
GIVE Declare output file
[ SUM FIELDS ] Sum fields for same record key, or eliminate duplicate keys
[ RECORD ] Record control statement
[ INCLUDE ] Select input records that respect include condition(s)
[ OMIT ] Omit input records that respect omit condition(s)
[ INREC ] Reformat input record Before sort, merge or copy operation
[ OUTREC ] Reformat input record After sort, merge or copy operation
[ OUTFIL ] Create one or more output files for sort,merge or copy operation
[ OPTION ] Specifies option for control statements
________________________________________________________________________________________
gcsort
SORT | MERGE | COPY
USE {Filename}
ORG {Org}
RECORD [F,{RecordLen}] | [V,{MinLen},{MaxLen}]
[KEY ({Pos},{Len},{KeyType})]
OUTFIL
INCLUDE | OMIT ({Condition})[,FORMAT={FormatType}]
OUTREC = ({FieldSpec})
FILES/FNAMES= {Filename} | (file1, file2, file3,...)
STARTREC={nn} Start from record nn
ENDREC={nn} Skip record after nn
SAVE
SPLIT Split 1 record output for file group (file1, file2, file3,...)
SPLITBY={nn} Split n records output for file group (file1, file2, file3,...)
OPTION
SKIPREC={nn} Skip nn records from input
STOPAFT={nn} Stop read after nn records
VLSCMP 0 disabled , 1 = enabled -- temporarily replace any
missing compare field bytes with binary zeros
VLSHRT 0 disabled , 1 = enabled -- treat any comparison
involving a short field as false
Y2PAST (YY) - Sliding, (YYYY) century
MODS E15=(<name>) [,] <name>= Name E15 Cobol Program for input
E35=(<name>) <name>= Name E35 Cobol Program for output
________________________________________________________________________________________
___{Parameters}____________________________|___{Relational}_____________________________
{FileName} = Filename or Env. Variable | EQ = Equal
{Pos} = Field Position | GT = GreaterThan
{Len} = Field Length | GE = GreaterEqual
{RecordLen}= Record Length | LT = LesserThan
{MinLen} = Min size of record | LE = LesserEqual
- In the TAKE file, the ‘*’ character indicates that the rest of the line is treated as a comment.
export LD_LIBRARY_PATH=/usr/local/lib
export GCSORT_MEMSIZE=1024000000
export GCSORT_BYTEORDER=0
export GCSORT_STATISTICS=2
echo " * This is comment " >TAKEFILE.PRM
echo "SORT FIELDS(4,1,CH,A) " >>TAKEFILE.PRM
echo "SUM FIELDS=(1,2,ZD,4,2,ZD,7,4,ZD,12,4,ZD) " >>TAKEFILE.PRM
echo "USE ../files/SQZD03 RECORD F,396 ORG SQ " >>TAKEFILE.PRM
echo "GIVE ../files/SQZD03.SRT RECORD F,396 ORG SQ " >>TAKEFILE.PRM
../bin/gcsort TAKE TAKEFILE.PRM
This picture shows the logical schema of the GCSort utility for SORT operations.
This picture shows the logical schema of the GCSort utility for MERGE operations.
3. Merge
The purpose of MERGE is to read one or more files and create an output file with data ordered as indicated by
the merge key fields.
The input data must already be sorted.
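Why the inputs must already be sorted can be seen in a minimal Python sketch of a k-way merge. This is not GCSort's implementation; the function name `merge_sorted` and the sample records are hypothetical, and the key is assumed to be a character field at a 1-based position, as in FIELDS.

```python
import heapq

def merge_sorted(inputs, pos, length):
    """Merge already-sorted record lists on a character key (1-based pos)."""
    def key(record):
        return record[pos - 1:pos - 1 + length]
    # heapq.merge only interleaves its inputs; it never reorders records
    # inside one input, so each input must be sorted on the key beforehand.
    return list(heapq.merge(*inputs, key=key))

file1 = ["A01 first", "C03 third"]      # hypothetical, sorted on columns 1-3
file2 = ["B02 second", "D04 fourth"]    # hypothetical, sorted on columns 1-3
merged = merge_sorted([file1, file2], pos=1, length=3)
print(merged)
```

If one of the inputs were not sorted, the merge would still run, but the output would not be in key order.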
LS = Line Sequential
LSF = Line Sequential Fixed
SQ = Sequential
IX = Indexed
RL = Relative
5. Field Type
The field type identifies the data format of a field. The supported field types are:
Type Description
CH Char
BI Binary unsigned
FI Binary signed
FL Floating Point
PD Packed
ZD Zoned
CLO Numeric sign leading
CSL Numeric sign leading separate
CST Numeric sign trailing separate
SS Search Substring
[ DATE4 ]
OMIT COND=(1,13,CH,GT,DATE4)
6.1. SORT
SORT is the command for ordering data.
Format 1 SORT
6.2.MERGE
MERGE is the command for merging data.
Format 1 MERGE
6.3.COPY
In a SORT or MERGE command, FIELDS=COPY copies data from the input file to the output file.
Format 1 FIELDS=COPY
6.4.FIELDS
This command specifies the fields for sort/merge operations. The fields are the keys used to order or merge
data from the files.
pos specifies the first byte of a control field relative to the beginning of the input record.
The first data byte of a fixed-length record has relative position 1.
The first data byte of a variable-length record has relative position 1.
len specifies the length of the field. Values for all fields must be expressed in integer numbers
of bytes.
type specifies the data format of the field.
Type Description
CH Char
BI Binary unsigned
FI Binary signed
FL Floating Point
PD Packed
ZD Zoned
CLO Numeric sign leading
CSL Numeric sign leading separate
CST Numeric sign trailing separate
SS Search Substring
order specifies how the field is to be ordered. The valid codes are:
A ascending order
D descending order
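As an illustration of how pos, len and order combine into a multi-key ordering, here is a minimal Python sketch. It handles only character (CH) keys, and `sort_records` is a hypothetical helper, not part of GCSort.

```python
def sort_records(records, fields):
    """Sort on several keys. fields: list of (pos, length, order),
    with 1-based pos and order 'A' (ascending) or 'D' (descending)."""
    result = list(records)
    # Python's sort is stable, so sorting on the least significant key first
    # and on the most significant key last gives a multi-key ordering.
    for pos, length, order in reversed(fields):
        result.sort(key=lambda r: r[pos - 1:pos - 1 + length],
                    reverse=(order == "D"))
    return result

records = ["xxB2", "xxA1", "xxB1", "xxA2"]   # hypothetical sample data
# Comparable to FIELDS=(3,1,CH,D,4,1,CH,A)
ordered = sort_records(records, [(3, 1, "D"), (4, 1, "A")])
print(ordered)
```

Other field types (BI, PD, ZD and so on) would need a decoding step before comparison, which this sketch leaves out.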
FORMAT=type can be used to specify a particular format for one or more control fields. The type from
FORMAT=type is applied to fields given only as pos,len,order (the p,m,s form).
FIELDS=COPY or FIELDS=(COPY)
Causes GCSort to copy the input file to the output file. Records can be edited with INCLUDE/OMIT, INREC,
OUTREC, and OUTFIL statements, and with the SKIPREC and STOPAFT parameters.
6.5.USE
The USE command declares the input file for a SORT or MERGE operation.
USE <filename> ORG <organization> RECORD [<record format>,<length min>,<length max>]
[KEY ({Pos},{Len},{KeyType})]
6.6.GIVE
The GIVE command declares the output file for a SORT or MERGE operation.
GIVE <filename> ORG <organization> RECORD [<record format>,<length min>,<length max>]
[KEY ({Pos},{Len},{KeyType})]
6.7.INCLUDE/OMIT
The INCLUDE condition statement selects the records to be written to the output file.
The OMIT condition statement excludes certain records of the input file.
condition
Format 1 (pos , len , type , cond, pos , len , type)
Format 2 (pos , len , type , cond, [X|C|Z]'[value]')
Format 3 (condition , relcond , condition)
pos specifies the first byte of a control field relative to the beginning of the input record.
The first data byte of a fixed-length record has relative position 1.
The first data byte of a variable-length record has relative position 1.
len specifies the length of the field. Values for all fields must be expressed in integer numbers of
bytes.
type specifies the data format of the field.
Type Description
CH Char
BI Binary unsigned
FI Binary signed
With the SearchSubstring option, you can search for substrings within a field. The length can be greater
than the length of the substring. It is possible to search for multiple substrings within the field.
Examples:
INCLUDE COND=(1,100,SS,EQ,C'66666')
INCLUDE FORMAT=SS,COND=(18,2,EQ,C'00,88,99')
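One plausible reading of the two examples can be sketched in Python. The semantics are an assumption, not confirmed by the manual: commas in the constant are treated as separators between alternative substrings, and the comparison is true when any of them occurs in the field. `ss_match` is a hypothetical name.

```python
def ss_match(record, pos, length, constant):
    """Return True when an SS ... EQ comparison would match (assumed semantics)."""
    field = record[pos - 1:pos - 1 + length]
    # Assumption: a comma separates alternative substrings in the constant.
    return any(value in field for value in constant.split(","))

# Like COND=(1,100,SS,EQ,C'66666'): the field is longer than the substring.
print(ss_match("abc66666def", 1, 100, "66666"))
# Like COND=(18,2,EQ,C'00,88,99') with FORMAT=SS: several candidate values.
print(ss_match("x" * 17 + "88" + "yyy", 18, 2, "00,88,99"))
```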
pos specifies the first byte of a control field relative to the beginning of the input record.
The first data byte of a fixed-length record has relative position 1.
The first data byte of a variable-length record has relative position 1.
len specifies the length of the field. Values for all fields must be expressed in integer numbers of
bytes.
type specifies the data format of the field.
Type Description
CH Char
BI Binary unsigned
FI Binary signed
FL Floating Point
PD Packed
ZD Zoned
CLO Numeric sign leading
CSL Numeric sign leading separate
CST Numeric sign trailing separate
SS Search Substring
X’hh..hh’ Hexadecimal String Format. The value hh represents any pair of hexadecimal digits.
CHANGE Specifies how the input field or parsed input field is changed to the output field, using a
lookup table.
NOMATCH If an input field value does not match any of the find constants, the NOMATCH value is used for
the output field.
CHANGE Specifies how the input field or parsed input field is changed to the output field, using a
position (posFind) and length (lenFind) of the input record.
NOMATCH If an input field value does not match any of the find constants, the input record position
and length given by NOMATCH are used for the output field.
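The two CHANGE variants can be sketched in Python as follows. This is only an illustration of the lookup idea under stated assumptions: the function names, the table layout and the sample record are hypothetical, and positions are 1-based as elsewhere in this guide.

```python
def change_constant(field, table, nomatch):
    """Lookup-table variant: replace field by a constant, else use nomatch."""
    return table.get(field, nomatch)

def change_position(record, pos, length, pairs, nomatch):
    """Position variant. pairs: list of (find, (pos_find, len_find));
    nomatch: (pos, len) of the input record used when nothing matches."""
    field = record[pos - 1:pos - 1 + length]
    for find, (p, n) in pairs:
        if field == find:
            return record[p - 1:p - 1 + n]
    p, n = nomatch
    return record[p - 1:p - 1 + n]

# Hypothetical record: column 1 is the tested field, columns 2-7 hold the
# NOMATCH source, columns 28-33 hold the replacement source.
record = "2" + "NOMTCH" + "." * 20 + "FOUND!"
print(change_constant("2", {"1": "ONE", "2": "TWO"}, "???"))
print(change_position(record, 1, 1, [("2", (28, 6))], (2, 6)))
```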
6.8.INREC/OUTREC
INREC redefines the structure of the input record. This operation is executed after the input file is read and
before all other operations.
The INREC control statement reformats the input records before they are sorted, merged, or copied.
All field specifications in OUTREC, the sort key, and so on must refer to the new structure defined by
INREC.
Use OVERLAY only to overwrite existing columns or to add fields at the end of every record.
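The effect of overwriting columns or appending at the end of a record can be pictured with a small Python sketch. It is not GCSort code: `overlay` is a hypothetical helper, and padding with spaces when the position lies past the end of the record is an assumption.

```python
def overlay(record, pos, data):
    """Overwrite record starting at 1-based pos with data, extending if needed."""
    start = pos - 1
    padded = record.ljust(start)   # assumption: gaps are filled with spaces
    return padded[:start] + data + padded[start + len(data):]

print(overlay("ABCDEF", 3, "xy"))   # overwrite existing columns 3-4
print(overlay("ABC", 4, "ZZ"))      # add a field at the end of the record
```

All other columns of the record are left unchanged, which is the difference from rebuilding the whole record field by field.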
_______________________________________________________________________________________
SUM FIELDS is the command that aggregates records and sums the values of numeric fields.
The fields listed in SUM FIELDS are aggregated when several records have the same key.
There are two formats for SUM FIELDS: the first sums the numeric fields, the second does NOT sum but
eliminates records with duplicate keys.
pos specifies the first byte of a control field relative to the beginning of the input record.
The first data byte of a fixed-length record has relative position 1.
The first data byte of a variable-length record has relative position 1.
len specifies the length of the field. Values for all fields must be expressed in integer numbers of bytes.
type specifies the data format of the field.
Type Description
BI Binary unsigned
FI Binary signed
FL Floating Point
PD Packed
ZD Zoned
CLO Numeric sign leading
CSL Numeric sign leading separate
CST Numeric sign trailing separate
In this case (Format 2) GCSort writes to the output file one occurrence of each key specified by the sort key.
The output record is the first record in reading order.
To identify the first occurrence, GCSort checks the position pointer of each record in the input file and
selects the lowest value.
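Both formats can be sketched in a few lines of Python. This is a simplified model, not GCSort's algorithm: fields are plain unsigned digit strings standing in for ZD, overflow is ignored, the records are assumed to be already ordered by key, and `sum_fields` is a hypothetical name.

```python
from itertools import groupby

def sum_fields(records, key, sums=None):
    """key: (pos, length); sums: list of (pos, length) digit fields to add up,
    or None to keep only the first record of each key (second format)."""
    def key_of(record):
        return record[key[0] - 1:key[0] - 1 + key[1]]
    out = []
    for _, group in groupby(records, key=key_of):
        group = list(group)
        first = group[0]               # first record in reading order
        for pos, length in sums or []:
            total = sum(int(r[pos - 1:pos - 1 + length]) for r in group)
            first = (first[:pos - 1] + str(total).zfill(length)
                     + first[pos - 1 + length:])
        out.append(first)
    return out

records = ["A001", "A002", "B010"]     # hypothetical: key in col 1, value in cols 2-4
print(sum_fields(records, (1, 1), [(2, 3)]))   # first format: sum
print(sum_fields(records, (1, 1)))             # second format: drop duplicates
```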
6.10. RECORD
The RECORD control statement is optional and specifies the type and length of the records.
Example:
[ RECORD CONTROL STATEMENT ]
SORT FIELDS=(8,5,CH,A) USE ../files/sqbig01.dat ORG SQ GIVE ../files/sqbig01_gcs.srt ORG SQ RECORD TYPE=F, LENGTH=500
RECORD TYPE=F,LENGTH=(,,500)
6.11. OUTFIL
OUTFIL is the command that creates one or more output files for a sort, copy, or merge operation.
Each output file is defined by an OUTFIL command.
FORMAT
OUTFIL
FILES/FNAMES= (environment variable)
STARTREC=nn
ENDREC=nn
[SAVE|[INCLUDE|OMIT] (CONDITION) [FORMAT=TYPE]]
SPLIT
OUTREC = (FIELD-SPEC...)
OUTFIL
FILES/FNAMES=filename filename = Identifies an environment variable that contains the file
name
STARTREC=nn Start write after nn records
ENDREC=nn Stop write after nn records
If the environment variable named by FILES/FNAMES is not defined, GCSort writes the output file in the local
folder, using the identifier itself as the file name (FILES/FNAMES=filename).
If OUTFIL does not include FNAMES/FILES, the data is written to the GIVE file.
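The name resolution described above, and the way SPLIT and SPLITBY distribute records over a file group, can be sketched in Python. The helper names are hypothetical, and the round-robin reading of SPLIT/SPLITBY is an assumption based on the short descriptions in the help text.

```python
import os

def resolve_outfil_name(identifier, environ=os.environ):
    """Use the environment variable's value if defined, else the identifier."""
    return environ.get(identifier, identifier)

def split_by(records, n_files, n=1):
    """Distribute records over n_files outputs, n records at a time
    (n=1 models SPLIT, n>1 models SPLITBY=n). Assumed round-robin order."""
    groups = [[] for _ in range(n_files)]
    for index, record in enumerate(records):
        groups[(index // n) % n_files].append(record)
    return groups

print(resolve_outfil_name("FOUT201_1", {"FOUT201_1": "/tmp/out1.txt"}))
print(resolve_outfil_name("FOUT201_1", {}))
print(split_by(list("abcdef"), 3))
print(split_by(list("abcdef"), 2, n=2))
```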
6.12. OPTION
Exit Routines
The purpose of the JOIN statement is to perform JOIN between two files (F1 and F2).
You can perform different types of join on two files (F1 and F2) by one or more keys with
GCSort using the following statements:
JOINKEYS
JOIN
Inner join – Default, only paired records from F1 and F2 are processed.
Left outer join - Unpaired F1 records as well as paired records.
Right outer join - Unpaired F2 records as well as paired records.
Full outer join - unpaired F1 and F2 records as well as paired records.
Unpaired F1,ONLY - Only unpaired F1 records
Unpaired F2,ONLY - Only unpaired F2 records
Unpaired F1,F2,ONLY / Unpaired,ONLY- Only unpaired F1 and F2 records
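The join types listed above differ only in which unpaired records are kept. A minimal Python sketch, assuming each file is a list of (key, data) pairs and using the indicators 'B' (key in both files), '1' (key only in F1) and '2' (key only in F2), could look like this. The function and its flags are hypothetical, and the output order is not meant to match GCSort's.

```python
def join(f1, f2, unpaired_f1=False, unpaired_f2=False, only=False):
    """Return (indicator, key, data1, data2) tuples for the selected join type."""
    by_key2 = {}
    for key, data in f2:
        by_key2.setdefault(key, []).append(data)
    keys1 = {key for key, _ in f1}
    out = []
    for key, data1 in f1:
        if key in by_key2:
            if not only:               # ONLY suppresses the paired records
                out.extend(("B", key, data1, data2) for data2 in by_key2[key])
        elif unpaired_f1:
            out.append(("1", key, data1, None))
    if unpaired_f2:
        out.extend(("2", key, None, data2)
                   for key, data2 in f2 if key not in keys1)
    return out

f1 = [("k1", "a"), ("k2", "b")]        # hypothetical sample data
f2 = [("k2", "x"), ("k3", "y")]
print(join(f1, f2))                                        # inner join
print(join(f1, f2, unpaired_f1=True))                      # left outer join
print(join(f1, f2, unpaired_f1=True, unpaired_f2=True))    # full outer join
print(join(f1, f2, unpaired_f1=True, unpaired_f2=True, only=True))
```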
REFORMAT
OUTFIL
INCLUDE | OMIT ({Condition})[,FORMAT={FormatType}]
OUTREC BUILD | BUILD = ({FieldSpec})
FILES/FNAMES= {Filename}
________________________________________________________________________________________
___{Parameters}____________________________|___{Parameters}_____________________________
{File} = F1 or F2 | ? = 1-byte indicator joined record
{Pos} = Field Position | 'B' = 'Both' - Key found in F1 and F2
{Len} = Field Length | '1' = Key found in F1, but not in F2
{Order} = A(ascending) | D(descending)| '2' = Key found in F2, but not in F1
C'Constant'= Character fill byte | nn = Numbers of records from input file
X'hh' = Hexadecimal fill byte (00-FF). |
________________________________________________________________________________________
___{Parameters}____________________________|___{Relational}_____________________________
{FileName} = Filename or Env. Variable | EQ = Equal
{Pos} = Field Position | GT = GreaterThan
{Len} = Field Length | GE = GreaterEqual
{RecordLen}= Record Length | LT = LesserThan
{MinLen} = Min size of record | LE = LesserEqual
{MaxLen} = Max size of record | NE = NotEqual
{Order} = A(ascending) | D(descending)| SS = Substring (only for Field Type 'CH')
___________________________________________|____________________________________________
___{Condition}__________________________________________________________________________
Format 1 - (Pos,Len,{FormatType},{Relational},[AND|OR],Pos,Len,{FormatType})
Format 2 - (Pos,Len,{FormatType},{Relational},[X|C'[value]'] | numeric value)]
Format 3 - ( {Condition} ,[AND|OR],{Condition} )
Format 4 - ( Pos,Len,{FormatType},{Relational}, [DATE1][(+/-)num] | [DATE2][(+/-)num]
[DATE3][(+/-)num] | [DATE4][(+/-)num]
DATE - Current Date : DATE1 (C'yyyymmdd'), DATE2 (C'yyyymm'),
DATE3 (C'yyyyddd'), DATE4 (C'yyyy-mm-dd') (no Timestamp)
[(+/-)num] [+num] future date, [-num] past date, only for DATE1,DATE2,DATE3
________________________________________________________________________________________
___{Org}___File Organization_______________|___{KeyType}____Mandatory for ORG = IX______
LS = Line Sequential | P = Primary Key
SQ = Sequential Fixed or Variable | A = Alternative Key
8.2.Temporary Files
When the size of the input files is greater than the available memory, GCSort creates temporary files for the
sort operation. Temporary files are created in the path specified by the GCSORT_TMPFILE environment
variable; if this value is not available, GCSort uses the TMP/TEMP environment variable or the current
directory. For Windows the file name is composed of:
- Prefix = Srt
- Name = name ( created from GetTempFileName())
- Extension = .tmp
For Linux the file name is composed of:
- Prefix = Srt
- Name = PID of process GCSort
- Num = Progressive of file
- Extension = .tmp
Temporary files are destroyed after sort operation.
GCSort analyzes the value and creates two areas for the sort operation:
To optimize memory use, GCSort checks the size of the key and of the record.
(8 + 4 + 8): 8 for the pointer to the record in the file, 4 for the record length, 8 for the pointer to the record area in memory.
8.4.Statistics
0 = minimal information
Example:
========================================================
GCSort Version 01.00.00
========================================================
TAKE file name
D:\GNU_COBOL\GCSort_1_0_0\gcsort_testcase\take\par_SORT_debug.par
========================================================
File : D:\GCSORTTEST\OCFILES\TEST9\INP000.txt
Size : 1194
========================================================
Record Number Total : 15
Record Write Sort Total : 0
Record Write Output Total : 15
========================================================
Start : Mon Jan 25 11:17:55 2016
End : Mon Jan 25 11:17:55 2016
Elapsed Time 00hh 00mm 00ss 000ms
Sort OK
1 = medium information
Example
========================================================
GCSORT
File TAKE : D:\GNU_COBOL\GCSort_1_0_0\gcsort_testcase\take\par_SORT_debug.par
========================================================
SORT FIELDS(3,1,CH,A)
USE D:\GCSORTTEST\OCFILES\TEST9\INP000.txt ORG LS RECORD V,1,27990
GIVE D:\GCSORTTEST\OCFILES\TEST9\OUT000.SRT ORG LS RECORD V,1,27990
========================================================
GCSort Version 01.00.00
========================================================
TAKE file name
D:\GNU_COBOL\GCSort_1_0_0\gcsort_testcase\take\par_SORT_debug.par
========================================================
Operation : SORT
Sort OK
2 = detailed information
========================================================
GCSORT
File TAKE : D:\GNU_COBOL\GCSort_1_0_0\gcsort_testcase\take\par_SORT_debug.par
========================================================
SORT FIELDS(3,1,CH,A)
USE D:\GCSORTTEST\OCFILES\TEST9\INP000.txt ORG LS RECORD V,1,27990
GIVE D:\GCSORTTEST\OCFILES\TEST9\OUT000.SRT ORG LS RECORD V,1,27990
========================================================
GCSort Version 01.00.00
========================================================
TAKE file name
D:\GNU_COBOL\GCSort_1_0_0\gcsort_testcase\take\par_SORT_debug.par
========================================================
Operation : SORT
INPUT FILE :
D:\GCSORTTEST\OCFILES\TEST9\INP000.txt VARIABLE (1,27990) LS
OUTPUT FILE :
D:\GCSORTTEST\OCFILES\TEST9\OUT000.SRT VARIABLE (1,27990) LS
SORT FIELDS : (3,1,CH,A)
========================================================
File : D:\GCSORTTEST\OCFILES\TEST9\INP000.txt
Size : 1194
After job_loadFiles - Mon Jan 25 11:21:44 2016
After job_sort - Mon Jan 25 11:21:44 2016
After job_save - Mon Jan 25 11:21:44 2016
========================================================
Record Number Total : 15
Record Write Sort Total : 0
Record Write Output Total : 15
========================================================
Sort OK
gcsort --help SORT | MERGE | COPY | JOIN prints help for the specific control statement.
gcsort TAKE filename reads the Sort/Merge commands from filename.
GCSort uses LIBCOB, which determines how records are built when writing output.
Use LSF file organization when the record to be sorted contains trailing spaces and you need fixed-length
records (GCSort does not delete trailing spaces).
Otherwise, you can set the environment variable COB_LS_FIXED=1 before running the GCSort command to
NOT delete trailing spaces.
0 for Success
4 for Warning
16 for Failure
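A calling script usually only needs to distinguish these three values. The following Python sketch shows one way to interpret them; the helper names are hypothetical, and treating a warning as success by default is a design choice of the sketch, not a GCSort rule.

```python
RETURN_CODES = {0: "success", 4: "warning", 16: "failure"}

def describe(return_code):
    """Translate a GCSort return code into a short label."""
    return RETURN_CODES.get(return_code, "unknown")

def succeeded(return_code, allow_warnings=True):
    """True for 0, and for 4 when warnings are acceptable to the caller."""
    return return_code == 0 or (allow_warnings and return_code == 4)

for code in (0, 4, 16):
    print(code, describe(code), succeeded(code))
```

In a real script the code would typically come from the process exit status, for example the `returncode` attribute of the object returned by `subprocess.run`.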
In this case GCSort converts data from one structure to another, for example from Sequential to Line
Sequential or vice versa.
If you want to sort a text file (LS) and you do not know the record length, you can specify RECORD V with a
very large maximum length, for example RECORD V,1,27990.
GCSORT_MLT Sets the multiplier that determines the size of the memory-mapped file (MMF) view used for
temporary files. This number is multiplied by the system page size (for example 65536). Increasing this
value makes the in-memory view of the file larger and can reduce the elapsed time (temporary files only).
By default GCSORT_MLT is 63 (example: 63 * 65536 = 4128768 bytes, about 4 MB per MMF view).
The maximum number of temporary files is 16. The temporary files are reused when the size of the input
files exceeds GCSORT_MEMSIZE * 16 files.
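The arithmetic behind these two settings is simple enough to check in Python. The function names are hypothetical; the default multiplier 63, the example page size 65536 and the limit of 16 temporary files are the values quoted above.

```python
def mmf_view_size(mlt=63, page_size=65536):
    """Size in bytes of one memory-mapped view: GCSORT_MLT * page size."""
    return mlt * page_size

def temp_reuse_threshold(memsize, max_temp_files=16):
    """Input size in bytes above which temporary files start being reused."""
    return memsize * max_temp_files

print(mmf_view_size())                      # default view size in bytes
print(temp_reuse_threshold(1024000000))     # with GCSORT_MEMSIZE=1024000000
```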
For an error message, GCSort stops execution and terminates the operation with a message and a return code.
For a warning message, GCSort continues the operation and prints a message.
The message string identifies a specific error or warning condition; in the case of a warning, a specific action is also printed.
16.1. SORT
SORT single file
====================================================================
SORT FIELDS(3,1,CH,A)
USE ../PJTestCaseSort/SQBI01 RECORD F,51 ORG SQ
GIVE ../PJTestCaseSort/SQBI01.SRT.TST RECORD F,51 ORG SQ
====================================================================
Order KEY
1) Position 37, Len 1, Character, Descending
2) Position 18, Len 17, Character, Ascending
Filter only records with character in position 37 Equal ‘C’.
=====================================================
SORT FIELDS=(37,1,CH,D,18,17,CH,A)
INCLUDE COND=(37,1,EQ,C'C') FORMAT=CH
USE FIL_100.TXT RECORD F,3000 ORG LS
GIVE FIL_100.TXT.SRT RECORD F,3000 ORG LS
=====================================================
16.2. MERGE
MERGE
FIELDS=COPY
Copy records from input to output.
Include condition check binary value (low-value)
Pos Len Condition Value
from 305 04 Not Equal Hex '00000000'
16.3. COPY
COPY
16.4. SUMFIELDS
SUMFIELDS
16.5. OUTREC
OUTREC FIELDS/BUILD
16.6. OUTFIL
OUTFIL INCLUDE
Example with several OUTFIL files
Each output file has its own INCLUDE condition
The purpose is to merge the files and write four outputs.
FNAMES=FOUT201_1
FOUT201_1 Environment Variable
FOUT201_2 Environment Variable
FOUT201_3 Environment Variable
FOUT201_SAVE Environment Variable
========================================================================
USE ../FIL_OUTFIL_001.TXT ORG LS RECORD F,3000
GIVE ../FIL_OUTFIL_001.TXT.OUT ORG LS RECORD F,3000
MERGE FIELDS=COPY
OUTFIL INCLUDE=(01,03,CH,EQ,C'201',AND,24,03,CH,LE,C'999'),FNAMES=FOUT201_1
OUTFIL INCLUDE=(01,03,CH,EQ,C'210',AND,24,04,CH,GT,C'0000',AND,24,04,CH,LE,C'9999'),FNAMES=FOUT201_2
OUTFIL INCLUDE=(01,03,CH,EQ,C'230',AND,36,04,CH,GT,C'0000',AND,36,04,CH,LE,C'9999'),FNAMES=FOUT201_3
OUTFIL SAVE,FNAMES=FOUT201_SAVE
========================================================================
OUTFIL OMIT
Format output record
OMIT Condition for input.
FOUTKEY_YES Environment Variable
FOUTKEY_NO Environment Variable
========================================================================
USE D:\GCSORTTEST\FilesT\FIL_OUTFIL_050.txt ORG LS RECORD F,3000
GIVE D:\GCSORTTEST\FilesT\FIL_OUTFIL_050.txt.OUT ORG LS RECORD F,3000
SORT FIELDS=COPY
OUTFIL OMIT=(156,15,CH,LT,141,15,CH,AND,005,10,CH,EQ,C'KEYMAX800E'),FNAMES=FOUTKEY_YES
OUTFIL SAVE,FNAMES=FOUTKEY_NO
END
========================================================================
[ CHANGE - Position ]
OUTREC FIELDS=(1,1,CHANGE=(6,C'2',28,6),NOMATCH=(2,6),X,8,19,35,15,51,59)
COND=(1,13,CH,GT,DATE2+3)
COND=(1,13,CH,GT,DATE2-8)
COND=(1,13,CH,GT,DATE3+150)
COND=(1,13,CH,GT,DATE3-15)
[ DATE4 ]
OMIT COND=(1,13,CH,GT,DATE4)
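To see what value a DATE constant stands for, the formats can be reproduced in Python. This is a sketch under assumptions that the manual does not state: the offset is taken as days for DATE1 and DATE3 and as months for DATE2, and `date_constant` is a hypothetical helper.

```python
from datetime import date, timedelta

def date_constant(kind, offset=0, today=None):
    """Return the character value of DATE1..DATE4 for the given day."""
    today = today or date.today()
    if kind == "DATE4" and offset:
        raise ValueError("offsets are only described for DATE1, DATE2, DATE3")
    if kind == "DATE2":
        # Assumption: the offset counts months.
        months = today.year * 12 + (today.month - 1) + offset
        return f"{months // 12:04d}{months % 12 + 1:02d}"
    # Assumption: the offset counts days.
    shifted = today + timedelta(days=offset)
    formats = {"DATE1": "%Y%m%d", "DATE3": "%Y%j", "DATE4": "%Y-%m-%d"}
    return shifted.strftime(formats[kind])

run_day = date(2016, 1, 25)            # fixed day so the output is repeatable
for kind, offset in (("DATE1", 0), ("DATE2", 3), ("DATE2", -8),
                     ("DATE3", 150), ("DATE3", -15), ("DATE4", 0)):
    print(kind, offset, date_constant(kind, offset, run_day))
```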
RECORD TYPE=F,LENGTH=(,,500)