File Handling - 5
CHAPTER 5
Introduction
A file in itself is a collection of bytes stored on a storage device such as a
hard disk or thumb drive.
DATA FILES: Data files are files that store data pertaining to a
specific application, for later use.
The data files can be stored in two ways:
i) Text files ii) Binary files
Text Files: A text file stores information as ASCII or Unicode characters.
In text files, each line of text is terminated with a special character
known as the EOL (end-of-line) character.
In Python, by default, this EOL character is the newline
character ('\n') or the carriage-return, newline combination ('\r\n').
Binary Files: A binary file is just a file that contains information in the
same format in which the information is held in memory.
In a binary file, there is no delimiter for a line and no translations
occur.
As a result, binary files are faster and easier for a program to read and
write than text files.
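The difference between the two kinds of file can be seen in a minimal sketch. The file name demo.txt and the temporary folder are assumptions made only for this illustration, and the with statement (not covered in these notes) simply closes the file when its block ends.

```python
import os
import tempfile

# Hypothetical scratch file, created only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# Text mode: ask Python to store every '\n' as the two-byte '\r\n' EOL sequence.
with open(path, "w", newline="\r\n") as f:
    f.write("one\ntwo\n")

# Binary mode: the bytes come back exactly as stored, with no translation.
with open(path, "rb") as f:
    raw_bytes = f.read()

# Text mode: the EOL sequence is translated back to '\n' while reading.
with open(path, "r") as f:
    text = f.read()

print(raw_bytes)
print(repr(text))
```

Reading the same file in the two modes gives different results only because of the EOL translation that text mode performs.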
OPENING AND CLOSING FILES
• In order to work with a file from within a Python program, we need to
open it in a specific mode.
• The most basic file manipulation tasks include adding, modifying or
deleting data in a file.
➢Reading data from files
➢Writing data to file
➢Appending data to files.
File = open("class12.txt", "r")
The statement above opens the file in read mode (this is the default mode).
Python's open() function creates a file object which serves as a link to a file on our
system.
The first parameter of open() is the path to the file that has to be opened.
The second parameter of open() is the mode: read ('r'),
write ('w') or append ('a'). [If no second parameter is given, the file is opened in read ('r') mode by default.]
F = open(r"c:\temp\data.txt", "r")    With r here, we can give single backslashes in pathnames.
The prefix r in front of a string makes it a raw string, which means that no special meaning is attached
to any character.
F = open("c:\temp\data.txt", "r")    This might give an incorrect result, as \t is the tab character.
Thus the two ways to give paths in filenames correctly are:
i) Double the backslashes, e.g.
f = open("c:\\temp\\data.txt", "r")
ii) Give a raw string by prefixing the file-path string with r, e.g.
f = open(r"c:\temp\data.txt", "r")
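The effect of the r prefix can be checked without opening any file. The short sketch below only compares strings; the variable names are chosen for this illustration.

```python
plain = "c:\temp"       # \t is read as a single tab character
doubled = "c:\\temp"    # each \\ stands for one backslash
raw = r"c:\temp"        # raw string: the backslash is kept as typed

print(repr(plain))      # the tab shows up as \t inside the string
print(doubled == raw)   # both spell the same path
print(len(plain), len(raw))
```

The plain string is one character shorter because the backslash and the letter t were merged into one tab character.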
File Object/ File Handle
File objects are used to read and write data to a file on disk.
It is used to obtain a reference to the file on disk and open it for a
number of different tasks.
The file object (file handle) is a very important and useful tool, because
all the functions that we perform on a data file are performed only through
file objects.
f = open("school.txt", "r")
File Access Modes
Python needs to know the file-mode in which the file is being opened.
File mode governs the type of operations(read ,write or append) possible in the opened file.
Text File Mode   Binary File Mode   Description      Notes
'r'              'rb'               read only        File must already exist; otherwise Python raises an I/O error.
'w'              'wb'               write only       If the file does not exist, it is created. If the file exists, Python truncates the existing data and overwrites the file.
'a'              'ab'               append           File is in write mode. If the file exists, its data is retained and new data is appended to the end. If the file does not exist, a new file is created.
'r+'             'r+b' or 'rb+'     read and write   File must exist, otherwise an error is raised. Both reading and writing operations can take place.
'w+'             'w+b' or 'wb+'     write and read   File is created if it does not exist. If the file exists, it is truncated. Both reading and writing operations can take place.
'a+'             'a+b' or 'ab+'     write and read   File is created if it does not exist. If the file exists, its existing data is retained and new data is appended. Both reading and writing operations can take place.
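The behaviour of the three basic modes can be tried out with a small sketch. The file name modes.txt and the temporary folder are assumptions for this illustration; the with statement closes each file at the end of its block.

```python
import os
import tempfile

# Hypothetical file inside a fresh temporary folder, so it does not exist yet.
path = os.path.join(tempfile.mkdtemp(), "modes.txt")

# 'r' on a file that does not exist raises an error.
try:
    open(path, "r")
    missing_error = None
except FileNotFoundError as exc:
    missing_error = exc

# 'w' creates the file; opening it with 'w' again truncates the old data.
with open(path, "w") as f:
    f.write("first\n")
with open(path, "w") as f:
    f.write("second\n")

# 'a' keeps the existing data and adds the new data at the end.
with open(path, "a") as f:
    f.write("third\n")

with open(path, "r") as f:
    content = f.read()

print(type(missing_error).__name__)
print(content)
```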
Closing Files: An open file is closed by calling the close() method of its file
object. It is an important step.
Although files are automatically closed at the end of the program, it is still
good practice to close them explicitly.
<fileHandle>.close()    The close() method must be called on a file handle, e.g.
outfile.close()
The close() function breaks the link between the file object and the file on the disk.
After close(), no task can be performed on that file through the file object.
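What "no task can be performed" means in practice is sketched below. The file name out.txt is an assumption, and the with statement in the second half is an addition not covered in these notes: it is one common way of getting close() called automatically.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "out.txt")  # hypothetical file name

outfile = open(path, "w")
outfile.write("data")
outfile.close()                 # link between file object and file is broken
closed_flag = outfile.closed    # the file object remembers that it is closed

# Any further operation through the closed file object fails.
try:
    outfile.write("more")
    error_after_close = None
except ValueError as exc:
    error_after_close = exc

# A with block calls close() for us when the block ends.
with open(path) as infile:
    saved = infile.read()
auto_closed = infile.closed

print(closed_flag, type(error_after_close).__name__, saved, auto_closed)
```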
READING AND WRITING FILE
A number of functions are available for reading and writing an open file.
1. Reading from files: there are three read functions for reading data from a
file. The file must be opened and linked via a file object (file handle).
read()    filehandle.read([n])    (reads at most n bytes; if no n is specified, reads the entire file)
file.read(20)    20 bytes read
readline()    filehandle.readline([n])    (reads a line of input; if n is specified, reads at most n bytes;
returns the bytes read in the form of a string)
readlines()    filehandle.readlines()    (reads all the lines and returns them in a list)
file1 = open("info.txt")
readinfo = file1.readline()    1 line read
print(readinfo)
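The read functions can be compared side by side in a short sketch. The file name and its three lines of sample text are assumptions made for this illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "info.txt")  # hypothetical file

# Create a small sample file to read from.
with open(path, "w") as f:
    f.write("line one\nline two\nline three\n")

with open(path) as f:
    first_four = f.read(4)        # at most 4 characters
    rest_of_line = f.readline()   # the rest of the current line, with its '\n'
    remaining = f.readlines()     # every remaining line, as a list of strings

print(repr(first_four))
print(repr(rest_of_line))
print(remaining)
```

Each call continues from where the previous one stopped, so nothing is read twice.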
If we want to retain the old data, then we should open the file in "a" or
append mode.
A file opened in append mode retains its previous data while allowing
new data to be added to it.
We can also add a plus symbol (+) to the file mode to allow
reading as well as writing.
In Python, writing to a file can take place in the following forms:
i) In an existing file, while retaining its content
a) if the file has been opened in append mode to retain old content.
b) if the file has been opened in 'r+' or 'a+' mode to read as well as write.
ii) To create a new file or to write to an existing file after
truncating/overwriting its old content
a) if the file has been opened in write-only mode ("w").
b) if the file has been opened in 'w+' mode to allow writing as well
as reading.
iii) Make sure to use the close() function on the file object after you have finished writing.
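The two forms of writing listed above can be contrasted with the '+' modes. This is a minimal sketch with an assumed file name; seek(0) moves back to the start of the file so that the content can be read again.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "notes.txt")  # hypothetical file

with open(path, "w") as f:
    f.write("alpha\nbeta\n")

# 'r+' retains the content; writing starts at the beginning and
# overwrites only the characters it covers.
with open(path, "r+") as f:
    f.write("ALPHA")
    f.seek(0)
    after_r_plus = f.read()

# 'w+' truncates the old content first, then allows writing and reading.
with open(path, "w+") as f:
    f.write("fresh\n")
    f.seek(0)
    after_w_plus = f.read()

print(repr(after_r_plus))
print(repr(after_w_plus))
```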
The flush() function
When we write to a file using any of the write functions, Python holds
everything to be written in a buffer and pushes it onto the actual file on the
storage device later.
If we want to force Python to write the contents of the buffer onto storage,
we use the flush() function.
Python automatically flushes a file's buffer when closing it, i.e. this
function is implicitly called by the close() function.
The flush() function forces the writing of data still pending in the
output buffer to disk.
<fileobject>.flush()
f = open('out.log', 'w+')
f.write('the output is \n')
f.write("my " + "work-status " + "is ")
f.flush()    # With this statement, the strings written so far, i.e. 'the output is' and
             # 'my work-status is ', have been pushed onto the actual file on disk
s = 'ok'
f.write(s)
f.write('\n')
f.write('finally over \n')
f.flush()    # flush() ensures that whatever is held in the output buffer
             # is written to the actual file on disk
f.close()
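One way to observe the buffer is to check the size of the file on disk before and after flush(). The sketch below assumes a scratch file in a temporary folder; the size before flushing is typically 0 for such a small write, but that depends on the buffer size, so it is printed rather than relied upon.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "out.log")  # hypothetical scratch file

f = open(path, "w")
f.write("the output is \n")
size_before_flush = os.path.getsize(path)   # text is usually still in the buffer
f.flush()                                   # push the buffer onto the disk file
size_after_flush = os.path.getsize(path)    # now the text is in the file
f.close()

print(size_before_flush, size_after_flush)
```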
Removing whitespaces after Reading from file.
The read() and readline() functions discussed above read data from a
file and return it in string form, and the readlines() function returns the
entire file content in a list where each line is one item of the
list.
All these read functions also read the leading and trailing whitespace.
If we want to remove any of this trailing and leading whitespace, we
can use the strip() functions.
• strip() removes the given characters (whitespace if none are given) from both ends.
• rstrip() removes them from the trailing end, i.e. the right end.
• lstrip() removes them from the leading end, i.e. the left end.
i) Removing the EOL '\n' character from a line read from the file
f = open("poem.txt", "r")
line = f.readline()
line = line.rstrip('\n')
ii) Removing the leading whitespace from a line read from the file
line = open("poem.txt", "r").readline()
line = line.lstrip()
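The three functions can be compared on one string without needing a file. The sample line below is made up for this illustration and stands in for a line returned by readline().

```python
# A made-up line with leading spaces and a tab, trailing spaces and an EOL.
line = "  \tTwinkle twinkle little star  \n"

no_eol = line.rstrip("\n")     # only the trailing '\n' is removed
right_clean = line.rstrip()    # all trailing whitespace is removed
left_clean = line.lstrip()     # all leading whitespace is removed
both_clean = line.strip()      # whitespace is removed from both ends
stars_removed = "***note***".strip("*")   # a given character instead of whitespace

for value in (no_eol, right_clean, left_clean, both_clean, stars_removed):
    print(repr(value))
```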
Significance of File Pointer in File Handling
Every file maintains a file pointer which tells the current position in the
file where writing or reading will take place.
When we read something from the file or write onto a file, then these
two things happen involving file-pointer:
i) This operation takes place at the position of file-pointer and
ii) File-pointer advances by the specified number of bytes.
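The two points above can be watched with tell(), which reports the current position of the file pointer. This is a small sketch on an assumed scratch file opened in binary mode, so that positions are plain byte counts.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "pointer.bin")  # hypothetical file

with open(path, "wb") as f:
    start = f.tell()             # pointer position right after opening
    f.write(b"0123456789")
    after_write = f.tell()       # advanced by the 10 bytes written

with open(path, "rb") as f:
    f.read(4)
    after_read = f.tell()        # advanced by the 4 bytes read
    next_byte = f.read(1)        # reading continues at the pointer

print(start, after_write, after_read, next_byte)
```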
File Modes and the opening Position of File-pointer
Standard Input, Output devices as files.
In Python, we can use the standard stream files through the sys module.
After importing it, we can use the standard streams
(stdin, stdout, stderr) in the same way we use other files.
• After importing the sys module in our program, sys.stdin.read() would
read from the keyboard. This is because the keyboard is the standard input device
linked to sys.stdin.
• sys.stdout.write() would let us write to the standard output device, the
monitor.
• sys.stdin and sys.stdout are the standard input and output devices,
respectively.
• sys.stdin and sys.stdout are opened by Python when it
starts: sys.stdin in read mode and sys.stdout in write mode.
import sys
fh = open(r"c:\t.txt")
line1 = fh.readline()
line2 = fh.readline()
sys.stdout.write(line1)
sys.stdout.write(line2)
sys.stderr.write(" no errors ")
fh.close()
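Because the standard streams behave like files, a function written for one can be handed the other. The helper echo_lines below is a hypothetical name introduced for this sketch, and io.StringIO is used as an in-memory stand-in for a file.

```python
import io
import sys


def echo_lines(lines, out=sys.stdout):
    """Write each line to the given stream (standard output by default)."""
    for line in lines:
        out.write(line)


# Goes to the monitor through sys.stdout.
echo_lines(["first line\n", "second line\n"])

# The same call works with any other file-like object.
buffer = io.StringIO()
echo_lines(["first line\n", "second line\n"], out=buffer)
captured = buffer.getvalue()

# Messages about problems usually go to the standard error stream.
sys.stderr.write("no errors\n")
```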
Absolute and Relative Path
The full name of a file or directory (folder) consists of
path\primaryname.extension
The path is a sequence of directory names which gives the hierarchy to
access a particular directory or file name.
Absolute paths start from the topmost level of the directory
structure.
E:\accounts\history\cast.act
Relative paths are relative to the current working directory, denoted by
a dot (.), while its parent directory is denoted by two dots (..).
..\PROJ1\REPORT.PRG
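Python's os module can show how a relative path is resolved against the current working directory. This is a minimal sketch; the path reuses the folder and file names from the example above, and no such file needs to exist.

```python
import os

# Build the relative path with the separator of the current system.
relative = os.path.join("..", "PROJ1", "REPORT.PRG")

# abspath() resolves it against the current working directory.
absolute = os.path.abspath(relative)

print(os.getcwd())               # the current working directory
print(relative, os.path.isabs(relative))
print(absolute, os.path.isabs(absolute))
```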
BINARY FILE OPERATIONS
Files that store objects as a byte stream are binary files.
Objects have a specific structure which must be maintained while
storing or accessing them.
Python provides a special module for this: the pickle module.
“The pickle module implements a fundamental but powerful algorithm
for serializing and de-serializing a Python object structure.
“Pickling” is the process whereby a Python object hierarchy is
converted into a byte stream, and “unpickling” is the inverse operation,
whereby a byte stream is converted back into an object hierarchy.”
In order to work with the pickle module, we have to import it in our
program using the import statement.
import pickle
The dump() and load() functions of the pickle module are used to write to and read from
a file, respectively.
Inserting and appending records into a binary file requires importing
the pickle module in our program and then calling the dump() function to write
to the file.
#program for inserting/appending a record in a binary file-student
import pickle
record = []
while True:
    roll_no = int(input("enter student roll no .."))
    name = input("enter the student Name ")
    marks = int(input("enter the marks obtained"))
    data = [roll_no, name, marks]
    record.append(data)
    choice = input(" wish to enter more records Y/N ?")
    if choice.upper() == 'N':
        break
# "wb" creates the file, or overwrites it if it already exists
f = open("student.txt", "wb")
pickle.dump(record, f)   # the whole list of records is written as one object
print("record added")
f.close()
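The program above writes all records in one go. If records have to be added to an existing binary file later, one possible approach (an assumption here, not the method used in these notes) is to open the file in "ab" mode, dump each record separately, and call load() repeatedly until the end of the file. The file name, the helper names and the sample records are hypothetical.

```python
import os
import pickle
import tempfile

path = os.path.join(tempfile.mkdtemp(), "student.dat")  # hypothetical file


def append_record(record):
    """Add one record at the end of the file; existing records are kept."""
    with open(path, "ab") as f:
        pickle.dump(record, f)


def read_all_records():
    """Load records one by one until the end of the file is reached."""
    records = []
    with open(path, "rb") as f:
        while True:
            try:
                records.append(pickle.load(f))
            except EOFError:   # raised by load() when nothing is left to read
                break
    return records


# Made-up sample data: [roll_no, name, marks]
append_record([1, "Asha", 90])
append_record([2, "Ravi", 85])
all_records = read_all_records()
print(all_records)
```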
# Program to read a record from the binary file-"student.txt"
import pickle
f = open("student.txt", "rb")
stud_rec = pickle.load(f)
print("Content of student file are ")
for R in stud_rec:
    roll_no = R[0]
    name = R[1]
    marks = R[2]
    print(roll_no, name, marks)
f.close()
#Program to search a record from the binary file "student.txt" on the basis of roll number
import pickle
f = open("student.txt", "rb")
stud_rec = pickle.load(f)
found = 0
rno = int(input("enter the roll number to search"))
for R in stud_rec:
    if R[0] == rno:
        print("successful search", R[1], "found")
        found = 1
        break
if found == 0:
    print(" sorry, record not found")
f.close()
#program to update the name of the student from the binary file
import pickle
f = open("student.txt", "rb+")
stud_rec = pickle.load(f)
found = 0
roll_no = int(input("enter the roll number to search"))
for R in stud_rec:
    rno = R[0]
    if rno == roll_no:
        print(" current name is : ", R[1])
        R[1] = input("new Name")
        found = 1
        break
if found == 1:
    f.seek(0)  # file pointer to the beginning of the file
    pickle.dump(stud_rec, f)
    f.truncate()  # discard leftover bytes if the new data is shorter than the old
    print(" Name updated !!!")
f.close()
RANDOM ACCESS IN FILE USING TELL() AND SEEK()
For random access to the data, the built-in methods seek() and tell() are used.
seek(): this function is used to change the position of the file handle (file
pointer) to a given specific position. (The file pointer is like a cursor.)
The file method seek() sets the file's current position at the given offset.
The reference point is defined by the "from_what" argument, which is optional and can have any of 3 values:
0: sets the reference point at the beginning of the file (absolute positioning), which is the default
1: sets the reference point at the current file position
2: sets the reference point at the end of the file
seek() can be done in two ways:
• Absolute Positioning
• Relative Positioning
• Absolute referencing using seek() gives the byte number at which the file
pointer has to position itself. The syntax for seek() is:
f.seek(file_location)
f.seek(20) places the file pointer at byte number 20, counted from the beginning of the file.
f.seek(-10, 1)    from the current position, move 10 bytes backward.
f.seek(10, 1)     from the current position, move 10 bytes forward.
f.seek(-30, 1)    from the current position, move 30 bytes backward.
f.seek(10, 0)     from the beginning of the file, move 10 bytes forward.
Relative seeks with a non-zero offset (reference point 1 or 2) work only on files opened in binary mode.
tell(): it returns the current position of the file read/write pointer within
the file.
f.tell()    # where f is the file object
• When we open a file in read or write mode, the file pointer rests
at the 0th byte.
• When we open a file in append mode, the file pointer rests just after the last
byte, i.e. at the end of the file.
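The three reference points and the opening positions can be tried on a small binary file. This is a minimal sketch; the file name and its 26 bytes of content are assumptions for this illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "letters.bin")  # hypothetical file

with open(path, "wb") as f:
    f.write(b"abcdefghijklmnopqrstuvwxyz")   # 26 bytes

with open(path, "rb") as f:
    pos_open = f.tell()    # read mode: pointer rests at the 0th byte
    f.seek(10)             # absolute: byte 10 from the beginning
    pos_abs = f.tell()
    f.seek(-4, 1)          # relative: 4 bytes back from the current position
    pos_rel = f.tell()
    letter = f.read(1)     # the byte found at that position
    f.seek(-1, 2)          # relative: 1 byte before the end of the file
    last = f.read(1)

with open(path, "ab") as f:
    pos_append = f.tell()  # append mode: pointer rests at the end of the file

print(pos_open, pos_abs, pos_rel, letter, last, pos_append)
```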
INTRODUCTION TO CSV
Data sharing is one of the major tasks to be carried out, largely through
spreadsheets or databases.
A basic approach to sharing data is through the comma-separated values
(CSV) file.
CSV is a simple flat file in a human-readable format which is extensively
used to store tabular data in a spreadsheet.
A CSV file stores tabular data (numbers and text) in plain text.
Files in this format can be imported to and exported from programs
that store data in tables, such as Microsoft Excel.
CSV stands for comma-separated values.
It is used for storing tabular data in a spreadsheet or database.
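Python's standard csv module can write and read such files. The sketch below is only an illustration: the file name marks.csv and the sample rows are made up, and the file is opened with newline="" as the csv module's documentation recommends.

```python
import csv
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "marks.csv")  # hypothetical file

# Made-up tabular data: a header row followed by two records.
rows = [["roll_no", "name", "marks"], [1, "Asha", 90], [2, "Ravi", 85]]

# Each inner list becomes one comma-separated line of plain text.
with open(path, "w", newline="") as f:
    csv.writer(f).writerows(rows)

# Reading gives back one list per line; every value comes back as a string.
with open(path, newline="") as f:
    read_back = list(csv.reader(f))

print(read_back)
```

Since the file is plain text, numbers are stored as characters and have to be converted back with int() or float() after reading.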