Machine Translated by Google

Projet C-Wire v1.4
FIELD: PréING2 • 2024-2025
AUTHORS: E. ANSERMIN - R. GRIGNON
E-MAILS: eva.ansermin@cyu.fr - romuald.grignon@cyu.fr

GENERAL DESCRIPTION

• This project requires you to write a program that synthesizes data from an electricity
  distribution system.

• To do this, you are given a .csv file containing a large set of data detailing the distribution
  of electricity in France, from power plants, through distribution substations, to the businesses
  and individuals who are the end customers.

• The data provided are fictitious, but the orders of magnitude match those of a country one
  quarter the size of metropolitan France in terms of production, consumption and number of
  consumers.

• Your job is to filter and process this data using a shell script (filtering part) and a C
  program (processing/calculation part).

ENERGY DISTRIBUTION

• Each power plant sends its electrical energy at very high voltage (~400 kV) to several HV-B
  (High Voltage B) substations, which lower the voltage (> 50 kV) and supply energy to a large
  geographical area.

  Each HV-B station transfers the energy to several HV-A (High Voltage A) substations, which
  lower the voltage further (> 1000 V) to provide the energy at a more regional level.

  Each HV-A substation then transfers the energy to several LV (Low Voltage) stations, which
  distribute the energy to individuals or small businesses (230 V).

• The topology is therefore a rooted tree, that is, a directed graph without cycles, starting
  from each power plant. The diagram of a tree for a power plant is as follows:

A tree representing a power plant is therefore composed of:

• The power plant (root of the tree)

  The power plant is the root of the tree and supplies several HV-B stations. These stations,
  distributed over different areas, redistribute the energy to other intermediate stations or
  directly to high-energy consumers.

• Hierarchical intermediate stations

  HV-B stations (first-level main nodes)

  • These stations are the first to receive energy from the power plant. There may be several
    HV-B stations, each covering a specific geographical area or sector.
  • They supply:
    - very energy-intensive consumers (such as large companies using a lot of electricity,
      e.g. SNCF, steelworks, factories);
    - HV-A stations, which serve intermediate consumers.

  HV-A stations (secondary nodes)

  • Each HV-A station is powered by an HV-B station and serves medium or large companies with
    moderate consumption (such as shopping malls or less energy-intensive industrial zones).
  • There may be multiple HV-A stations.

  LV stations (low-voltage stations)

  • Each LV station is powered by an HV-A station. It transforms the energy to make it compatible
    with the low needs of end users (individuals, small businesses).
  • There are many LV stations, often close to residential and small commercial areas.

• Final consumers (leaves of the tree)

  • Large energy-intensive companies are connected directly to HV-B stations.
  • Medium/large companies and some shopping centers are served directly by HV-A stations.
  • Individuals and small businesses are supplied via the LV stations. These consumers are also
    leaves of the tree, where the energy is consumed without further redistribution.

  Each LV station is connected to a single HV-A station, each HV-A station is connected to a
  single HV-B station, and each HV-B station is connected to a single power plant. There are
  multiple power plants.

• In all the data available, we have:

  • 5 power plants
  • ~118 HV-B substations
  • ~512 HV-A substations
  • +180k LV stations
  • +1.25M consumers (businesses)
  • +7.6M consumers (individuals)

• Each of these actors (energy distributor or consumer) is associated with an identification
  number which is unique within its category.

DATA FORMAT

• The DATA_CWIRE.csv file contains the distribution and consumption information of all the
  actors mentioned above.
  Given their number, this file is very large and cannot be processed manually or with a
  spreadsheet.
  This part explains its content so that its processing can be automated.

• The data file is a CSV file containing the following columns:

  • Power plant: identifier of a power plant (producer)
  • HV-B Station: identifier of an HV-B station
  • HV-A Station: identifier of an HV-A station
  • LV Station: identifier of an LV station
  • Company: identifier of a company (consumer)
  • Individual: identifier of an individual (consumer)
  • Capacity: quantity of energy produced by a power plant or transferred by an HV-B, HV-A or
    LV station (in kWh)
  • Load: quantity of energy consumed by the final consumer (company or individual) (in kWh)

• Each line of the file represents an actor of this distribution network (power plant, HV-B,
  HV-A, LV, company or individual) or a link between 2 actors.

• Many fields in this file are empty. For example, the actors that supply electricity are the
  power plants and the HV-B, HV-A and LV stations. The rows describing them contain a value in
  the "Capacity" column (production/transfer capacity), while the "Load" column (consumption)
  is empty. Conversely, for rows describing electricity consumers (companies and individuals),
  the "Load" column is filled in and the "Capacity" column remains empty.

• Concerning the identifier columns, for each row, the column corresponding to the type of actor
  is filled in with the identifier of the actor concerned. For example, a row describing an HV-A
  station has the "HV-A Station" column filled in with the identifier of this station. In
  addition, the identifier of the station that directly supplies it with energy is also given
  (see examples below). This makes it possible to trace the origin of the energy distributed.

• Examples of lines from the CSV file

  • Power plant
    1;-;-;-;-;-;17972235418;-
    - defines the root of one of the trees
    - contains the plant identifier (1)
    - contains the maximum production (17972235418 kWh)

  • HV-B station
    3;73;-;-;-;-;554172263;-
    - defines an HV-B node
    - contains the identifier of the parent plant (3)
    - contains the identifier of the HV-B station (73)
    - contains the transfer capacity (554172263 kWh)

  • HV-B consumer (company)
    1;17;-;-;23;-;-;123823310
    - defines an HV-B consumer (leaf)
    - contains the identifier of the root plant (1)
    - contains the identifier of the parent HV-B station (17)
    - contains the identifier of the consumer (23)
    - contains the customer consumption (123823310 kWh)

  • HV-A station
    2;31;119;-;-;-;284930294;-
    - defines an HV-A node
    - contains the identifier of the root plant (2)
    - contains the identifier of the parent HV-B station (31)
    - contains the identifier of the HV-A station (119)
    - contains the transfer capacity (284930294 kWh)

  • HV-A consumer (company)
    2;-;117;-;2211;-;-;3389540
    - defines an HV-A consumer (leaf)
    - contains the identifier of the root plant (2)
    - contains the identifier of the parent HV-A station (117)
    - contains the identifier of the consumer (2211)
    - contains the customer consumption (3389540 kWh)

  • LV station
    3;-;274;103733;-;-;112040;-
    - defines an LV node
    - contains the identifier of the root plant (3)
    - contains the identifier of the parent HV-A station (274)
    - contains the identifier of the LV station (103733)
    - contains the transfer capacity (112040 kWh)

  • LV consumer (company)
    1;-;-;10520;69817;-;-;20659
    - defines an LV consumer (leaf)
    - contains the identifier of the root plant (1)
    - contains the identifier of the parent LV station (10520)
    - contains the identifier of the consumer (69817)
    - contains the customer consumption (20659 kWh)

  • LV consumer (individual)
    1;-;-;10520;-;411970;-;5270
    - defines an LV consumer (leaf)
    - contains the identifier of the root plant (1)
    - contains the identifier of the parent LV station (10520)
    - contains the identifier of the consumer (411970)
    - contains the customer consumption (5270 kWh)
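
The line format shown in the examples above can be read with a small parser. The following is only a minimal sketch in C, not a required design: the names Row, RowKind, parse_row and classify_row are hypothetical, and it assumes that identifiers are strictly positive so that 0 can stand for an empty field ('-'). The header line of the file is rejected as invalid and would have to be skipped by the caller.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CWIRE_FIELDS 8  /* number of ';'-separated columns per line */

/* Hypothetical record layout: 0 means "empty" ('-' in the file). */
typedef struct {
    uint64_t plant, hvb, hva, lv, company, individual;
    uint64_t capacity, load;
} Row;

typedef enum { ROW_INVALID, ROW_PLANT, ROW_HVB, ROW_HVA, ROW_LV, ROW_CONSUMER } RowKind;

/* Parse one field of `len` characters: "-" gives 0, otherwise decimal digits.
 * Returns 0 on success, 1 on malformed input. */
static int parse_field(const char *s, size_t len, uint64_t *out)
{
    if (len == 1 && s[0] == '-') { *out = 0; return 0; }
    if (len == 0 || len > 19) return 1;      /* 19 digits always fit in 64 bits */
    uint64_t v = 0;
    for (size_t i = 0; i < len; i++) {
        if (s[i] < '0' || s[i] > '9') return 1;
        v = v * 10 + (uint64_t)(s[i] - '0');
    }
    *out = v;
    return 0;
}

/* Parse a full line into `row`. Returns 0 on success, 1 if the line is malformed. */
int parse_row(const char *line, Row *row)
{
    uint64_t v[CWIRE_FIELDS];
    const char *p = line;
    for (int i = 0; i < CWIRE_FIELDS; i++) {
        size_t len = strcspn(p, ";\r\n");
        if (parse_field(p, len, &v[i]) != 0) return 1;
        p += len;
        if (i < CWIRE_FIELDS - 1) {
            if (*p != ';') return 1;         /* too few fields */
            p++;
        }
    }
    if (*p != '\0' && *p != '\n' && *p != '\r') return 1;  /* too many fields */
    row->plant = v[0];   row->hvb = v[1];        row->hva = v[2];
    row->lv = v[3];      row->company = v[4];    row->individual = v[5];
    row->capacity = v[6]; row->load = v[7];
    return 0;
}

/* The most specific identifier column that is filled in gives the actor type. */
RowKind classify_row(const Row *r)
{
    if (r->company || r->individual) return ROW_CONSUMER;
    if (r->lv)    return ROW_LV;
    if (r->hva)   return ROW_HVA;
    if (r->hvb)   return ROW_HVB;
    if (r->plant) return ROW_PLANT;
    return ROW_INVALID;
}
```

For instance, parse_row on the HV-A station example fills hvb with 31 and hva with 119, and classify_row then reports ROW_HVA; the parent of a consumer row is whichever station column is filled in.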

OBJECTIVE

• Your project should let you analyze the stations (power plants, HV-A stations, HV-B stations,
  LV stations) to determine whether they are in a situation of overproduction or underproduction
  of energy, and to assess what proportion of their energy is consumed by businesses and
  individuals.

• This objective requires that:

  • the user can easily define the observation parameters (the type of station to analyze) and
    the customer categories to examine;
  • your project filters the relevant information in the .csv file;
  • your solution calculates the sum of the consumption of all customers associated with these
    stations;
  • the results are saved in a structured manner (see the description of the output files below).

• Output files

  Each line of the expected output files contains 3 columns:
  • station identifier
  • capacity in kWh
  • consumption in kWh

  The first line of each output file is a header that names the columns according to the
  parameters given:
  • Station HVB, Station HVA or Station LV
  • Capacity
  • Consumption (businesses), Consumption (individuals) or Consumption (all)

  These files are CSV files that use the ':' (colon) character as the separator. Each file is
  named after the station option (hvb, hva, lv), followed by an underscore '_', followed by the
  consumer option (comp, indiv or all). If the results are additionally filtered by power plant
  number, that number is appended to the name.
  Ex: hva_comp.csv or lv_all.csv or lv_all_2.csv

  The data is sorted by increasing capacity (smallest capacity first).
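
The naming rule and the header line described above can be sketched as follows. This is only an illustration in C under stated assumptions: the function names build_output_name and build_header are hypothetical, a plant number of 0 or less is taken to mean "no plant filter", and the specification does not say which part of the project (script or C program) must produce these strings.

```c
#include <stdio.h>
#include <string.h>

/* Build "<station>_<consumer>.csv", or "<station>_<consumer>_<plant>.csv" when a
 * plant filter is given. Assumption: plant_id <= 0 means "no filter".
 * Returns 0 on success, 1 if the buffer is too small. */
int build_output_name(char *buf, size_t size, const char *station,
                      const char *consumer, int plant_id)
{
    int n;
    if (plant_id > 0)
        n = snprintf(buf, size, "%s_%s_%d.csv", station, consumer, plant_id);
    else
        n = snprintf(buf, size, "%s_%s.csv", station, consumer);
    return (n < 0 || (size_t)n >= size) ? 1 : 0;
}

/* Build the ':'-separated header line for the given options.
 * Returns 0 on success, 1 on unknown option or buffer too small. */
int build_header(char *buf, size_t size, const char *station, const char *consumer)
{
    const char *st, *co;

    if      (strcmp(station, "hvb") == 0) st = "Station HVB";
    else if (strcmp(station, "hva") == 0) st = "Station HVA";
    else if (strcmp(station, "lv")  == 0) st = "Station LV";
    else return 1;

    if      (strcmp(consumer, "comp")  == 0) co = "Consumption (businesses)";
    else if (strcmp(consumer, "indiv") == 0) co = "Consumption (individuals)";
    else if (strcmp(consumer, "all")   == 0) co = "Consumption (all)";
    else return 1;

    int n = snprintf(buf, size, "%s:Capacity:%s", st, co);
    return (n < 0 || (size_t)n >= size) ? 1 : 0;
}
```

Checking the value returned by snprintf against the buffer size avoids silently truncated file names.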


• Given the very large number of LV stations, for the lv all option the script must perform an
  additional step: store in another file only the 10 LV stations with the highest consumption
  and the 10 LV stations with the lowest consumption.

  These 20 LV stations are sorted by absolute amount of excess energy consumed (from the most
  loaded station to the least loaded). In other words, the difference between the total capacity
  and the total consumption is computed, and the rows are sorted by increasing value of this
  difference.

  The name of this file is lv_all_minmax.csv.
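
The selection and ordering rule above can be made explicit with a short sketch. The specification assigns this step to the script; it is written in C here only to pin down the logic. The Station type and the function select_minmax are hypothetical, and the sketch assumes that capacities and consumptions stay well below 2^63 kWh so that their difference fits in a signed 64-bit integer.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-station summary. */
typedef struct {
    uint32_t id;
    uint64_t capacity;  /* kWh */
    uint64_t load;      /* total consumption, kWh */
} Station;

/* capacity - load as a signed value: negative means the station is overloaded. */
static int64_t margin(const Station *s)
{
    return (int64_t)s->capacity - (int64_t)s->load;
}

static int cmp_load(const void *a, const void *b)
{
    const Station *x = a, *y = b;
    return (x->load > y->load) - (x->load < y->load);
}

static int cmp_margin(const void *a, const void *b)
{
    int64_t ma = margin(a), mb = margin(b);
    return (ma > mb) - (ma < mb);   /* avoids overflow of a plain subtraction */
}

/* Keep the 10 least and the 10 most consuming stations (all of them if n <= 20),
 * then order the selection by increasing (capacity - load).
 * `st` is reordered in place; `out` must have room for 20 entries.
 * Returns the number of entries written to `out`. */
size_t select_minmax(Station *st, size_t n, Station *out)
{
    size_t k = 0;
    qsort(st, n, sizeof *st, cmp_load);
    if (n <= 20) {
        for (size_t i = 0; i < n; i++) out[k++] = st[i];
    } else {
        for (size_t i = 0; i < 10; i++)      out[k++] = st[i];
        for (size_t i = n - 10; i < n; i++)  out[k++] = st[i];
    }
    qsort(out, k, sizeof *out, cmp_margin);
    return k;
}
```

With this ordering the most overloaded station (most negative difference) comes first and the station with the largest remaining margin comes last.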

• To meet this overall objective, you will have to create a Shell script as well as a C program
  whose more specific objectives are detailed in the following sections.

SHELL SCRIPT

• Your Shell script will be named c-wire.sh.

  It takes as a parameter the path of the input CSV file containing the data, plus other
  parameters that select the processing to perform. The processing associated with each
  parameter is described later in this section.

• The parameters of the script are as follows (in order):

  • Data file path
    - mandatory
    - indicates the location of the input file

  • Type of station to process
    - mandatory
    - possible values:
      hvb (high-voltage B)
      hva (high-voltage A)
      lv (low-voltage)

  • Type of consumer to process
    - mandatory
    - possible values:
      comp (companies)
      indiv (individuals)
      all (all)
    - ATTENTION: the following combinations are forbidden because only companies are connected
      to the HV-B and HV-A stations:
      hvb all
      hvb indiv
      hva all
      hva indiv
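
The rule on allowed combinations can be stated compactly. The specification requires the Shell script to perform this check; the sketch below expresses the same logic in C purely for illustration. The function check_options and its return codes are hypothetical and are not defined by the specification.

```c
#include <string.h>

/* Hypothetical return codes (not taken from the specification). */
enum { OPT_OK = 0, OPT_BAD_STATION = 1, OPT_BAD_CONSUMER = 2, OPT_FORBIDDEN = 3 };

/* Validate the station type and consumer type options.
 * Only "lv" accepts "indiv" and "all"; "hvb" and "hva" accept "comp" only. */
int check_options(const char *station, const char *consumer)
{
    int is_lv = strcmp(station, "lv") == 0;
    if (!is_lv && strcmp(station, "hvb") != 0 && strcmp(station, "hva") != 0)
        return OPT_BAD_STATION;

    int is_comp = strcmp(consumer, "comp") == 0;
    if (!is_comp && strcmp(consumer, "indiv") != 0 && strcmp(consumer, "all") != 0)
        return OPT_BAD_CONSUMER;

    if (!is_lv && !is_comp)
        return OPT_FORBIDDEN;   /* hvb/hva combined with indiv/all */

    return OPT_OK;
}
```

Distinguishing the three failure cases makes it easier to print an error message that gives the details of the error before the help text is displayed.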

  • Plant ID
    - optional
    - filters the results for a specific power plant
    - if this option is absent, the processing is carried out on all the power plants in the
      file

  • Help option (-h)
    - optional, and takes priority: if present, all other options are ignored, regardless of the
      position of the help option
    - displays detailed help on using the script (description, features, possible options, order
      of parameters, etc.)

• The Shell script must ensure that all mandatory options are present, and that their values
  are consistent, before starting any processing. If an option is wrong, an error message giving
  the details of the error must be displayed. The help must also be displayed below the error
  message, as if the user had typed the -h option.

• The Shell script must check for the presence of the C executable on the hard disk and, if it
  is not present, launch the compilation and check that it completed correctly. If it did not,
  an error message must be displayed. Once the compilation is done, the script can perform the
  processing requested by its arguments.

• The Shell script must check for the presence of the tmp and graphs folders (see Additional
  Information): if they do not exist, it must create them. If the tmp folder already exists, it
  must be emptied before the processing is executed.

• The duration of each processing step must be displayed in seconds at the end. The durations
  must be displayed systematically, whether the script ran correctly or encountered an error.
  They do not include the compilation of the C program or the creation of the folders carried
  out at the start: the times displayed must be the useful time spent processing the data. If
  the script fails on the parameter values, no processing will have been launched, so the time
  displayed will be 0.0sec.

• Once the data processing is finished, the Shell script must create the files and/or graphs
  that contain the output data of the processing. For the graphs you can use the GnuPlot program
  (see Bonus).

• When the script is given a station type (hvb, hva, lv), the final goal is to create a file
  containing a list of these stations with, for each one, its capacity (the amount of energy it
  can let through) and the sum of the consumption of the consumers connected directly to it.

• The sum only takes into account the consumers that were requested (comp: companies only,
  indiv: individuals only, all: all consumers).

• To obtain the sum of the consumers of a station, you must build a balanced binary search
  tree. This is the role of the C part of your project. This processing (the sum) must be done
  by the C program. If your Shell script performs all or part of this sum, your grade will be
  penalized.

C PROGRAM

• The goal of this program is to add up the consumption for one type of station. Given the
  large amount of data and the number of stations, an AVL tree is used to limit the processing
  time.

• Each node of the AVL tree represents a station. It contains the station identifier and its
  data, such as its capacity and the sum of its consumers' consumption, which is updated as your
  program reads the data.

• The format of the input data of the C program is left free. It is up to you to decide whether
  you pass the entire data file, or filter it (rows / columns) with the Shell script before
  passing the information to the C program.

• The format of the output data of the C program is also left free, as long as it allows you to
  retrieve the information necessary for your processing.

• The C program computes the sum of consumptions for HV-B, HV-A or LV stations, using an AVL
  tree (balanced binary search tree). A generic solution must be designed so that a single
  program handles all three types of stations (optimize data transmission).

• Any implementation using only an unbalanced binary search tree (BST) will result in a penalty.
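
The accumulation described in this section can be sketched with a standard recursive AVL insertion. This is a minimal sketch under stated assumptions, not the expected implementation: the Node layout and the functions avl_add and avl_find are hypothetical, identifiers are assumed to fit in 32 bits, and a capacity of 0 is taken to mean "this row carries no capacity" (a consumer row).

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical node: one station, keyed by its identifier. */
typedef struct Node {
    uint32_t id;
    uint64_t capacity;   /* kWh, set by the station's own row */
    uint64_t load;       /* kWh, running sum of consumer rows */
    int8_t height;       /* 1 for a leaf; AVL heights stay far below 127 */
    struct Node *left, *right;
} Node;

static int node_height(const Node *n) { return n ? n->height : 0; }

static void update(Node *n)
{
    int hl = node_height(n->left), hr = node_height(n->right);
    n->height = (int8_t)(1 + (hl > hr ? hl : hr));
}

static Node *rotate_right(Node *y)
{
    Node *x = y->left;
    y->left = x->right;
    x->right = y;
    update(y);
    update(x);
    return x;
}

static Node *rotate_left(Node *x)
{
    Node *y = x->right;
    x->right = y->left;
    y->left = x;
    update(x);
    update(y);
    return y;
}

/* Restore the AVL property at n (single or double rotation as needed). */
static Node *rebalance(Node *n)
{
    update(n);
    int bal = node_height(n->left) - node_height(n->right);
    if (bal > 1) {
        if (node_height(n->left->left) < node_height(n->left->right))
            n->left = rotate_left(n->left);
        return rotate_right(n);
    }
    if (bal < -1) {
        if (node_height(n->right->right) < node_height(n->right->left))
            n->right = rotate_right(n->right);
        return rotate_left(n);
    }
    return n;
}

/* Record a capacity and/or add a consumption for station `id`, creating the node
 * if needed. Sets *err to 1 on allocation failure; the tree stays valid. */
Node *avl_add(Node *root, uint32_t id, uint64_t capacity, uint64_t load, int *err)
{
    if (root == NULL) {
        Node *n = malloc(sizeof *n);
        if (n == NULL) { *err = 1; return NULL; }
        n->id = id; n->capacity = capacity; n->load = load;
        n->height = 1; n->left = n->right = NULL;
        return n;
    }
    if (id < root->id)      root->left  = avl_add(root->left,  id, capacity, load, err);
    else if (id > root->id) root->right = avl_add(root->right, id, capacity, load, err);
    else {
        if (capacity != 0) root->capacity = capacity;
        root->load += load;
        return root;
    }
    return rebalance(root);
}

/* Iterative lookup; returns NULL when the station is absent. */
const Node *avl_find(const Node *root, uint32_t id)
{
    while (root != NULL && root->id != id)
        root = id < root->id ? root->left : root->right;
    return root;
}
```

A station row would be fed in as avl_add(root, id, capacity, 0, &err) and a consumer row as avl_add(root, parent_id, 0, load, &err), so the order in which rows arrive does not matter. Storing the height in a single byte is one way to keep the node small.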


• The C program must return an error code with a strictly positive value if a problem is
  encountered, and 0 otherwise. This program must never stop unexpectedly: it is up to you to
  check that all the data you process is correct. If you detect a problem, you must stop the
  program and return an error code.

• The C program code must therefore be robust.

• You are also asked to limit the memory used as much as possible. To do this, define the
  variables in your structures with the smallest possible memory footprint, while still
  guaranteeing the requested functionality.

  Additionally, all dynamic allocations must be freed before the C program terminates normally.
  However, if the C program encounters an error, it is not required to free all memory
  explicitly.
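
Releasing a tree of dynamically allocated nodes is usually done with a post-order traversal, so that children are freed before their parent. The sketch below assumes a hypothetical Node type with left and right children; the name avl_free and the returned node count are illustrative additions, not requirements of the specification.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical node layout, small fields first chosen to limit the footprint. */
typedef struct Node {
    uint32_t id;
    uint64_t capacity;
    uint64_t load;
    int8_t height;
    struct Node *left, *right;
} Node;

/* Free every node of the tree (post-order) and return how many were released.
 * The recursion depth equals the tree height, which stays small for an AVL tree. */
size_t avl_free(Node *root)
{
    if (root == NULL)
        return 0;
    size_t freed = avl_free(root->left);
    freed += avl_free(root->right);
    free(root);
    return freed + 1;
}
```

Comparing the returned count with the number of nodes created is a cheap sanity check; a tool such as valgrind can confirm that nothing is left allocated at exit.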

• Finally, the C program must be compiled with the 'make' utility using a 'Makefile'. The only
  compilation instruction expected in your Shell script is the call to the 'make' executable.
  Your script must not call gcc directly.

ADDITIONAL INFORMATION

• The CSV data file

  The "data.csv" file provided contains all the data. It is a large file (+150MB) with more
  than 5 million lines.

  It will therefore be very difficult, if not impossible, to get a precise overview of this
  file with traditional office tools. You will have to use a computer program to extract the
  data you need.

• Organizing your project files

  • The input data file must be copied into an 'input' folder.
  • The C program and all related files (Makefile, executable, …) must be located in a 'codeC'
    folder.
  • The graphs, if any, are stored as images on the hard disk in a 'graphs' folder.
  • The intermediate files needed by your application (if any) are placed in a 'tmp' folder.
  • The results of previous runs go in the 'tests' folder.
  • The Shell script is placed at the root of your project.

BONUS

• For the lv all processing, you are asked as a bonus to create a bar graph of the 10 most
  loaded LV stations and the 10 least loaded LV stations.

• These 20 bars must use an explicit color code (red/green) to show whether energy is consumed
  in excess or whether there is a margin.

• To create a graph as an image, use the Linux command-line utility GnuPlot.

• You can add the bonuses of your choice (other options, interface, etc.) as long as you
  consider the rest of the specification fulfilled.

GRADING CRITERIA

• The work is delivered as a GitHub link to the project. There is no need to send files by
  email, Teams or any other means: the only deliverable expected, and the only one evaluated,
  is the link to the git repository in which all the files of your project must be located.
  Before the delivery date you can set this repository to "private" to prevent other people
  from plagiarizing you.

  By the due date, this repository must be publicly visible so that your project supervisors
  can freely access it. Any delay in making this repository publicly accessible will have a
  negative impact on your final grade.

• In addition to the code files, the repository must contain a ReadMe text file with
  instructions for compiling and using your application. It must also contain a PDF document
  presenting the distribution of tasks within the group, the implementation schedule, and the
  functional limitations of your application (the list of what is not implemented, and/or what
  is implemented but does not work correctly/completely).

• You are required to push to the repository at least once after each tutorial (TD) session on
  the project, even if the code in the repository is not functional.

  If your code changes between two sessions, you must have made a commit the day before the
  session, to properly record the history of your modifications and to be able to work on them
  with your TD supervisor.

  At a minimum, then, there will be a commit just before and just after each session.

  Modifications, even non-functional ones, should not remain several days without being stored
  in the repository, so as to avoid a catastrophic loss if your machine breaks down. It is your
  responsibility to regularly archive your modifications in the repository. No loss of code can
  be taken into account if you do not use your repository.

  This will also force you to develop as a team with a central repository that you share and
  that contains the latest changes to the project.

• Your submission must contain examples of runs of your application.

  Put the images and the intermediate and final files in the 'tests' folder, and present these
  results in the PDF document mentioned above. These elements must therefore be reproducible
  using your program.

• The submission is group work: if similarities between groups are found, and/or if examples
  available on the Internet are used without being sourced, an exam cheating procedure may be
  considered. The educational goal of this project is for you to write this program yourself,
  and to master all of the code you deliver.

• The C code must be separated into modules (.c and .h files, sub-files). Your C program must
  not be a single block.

• A Makefile must be present and must allow the executable to be compiled. The first target
  compiles the project. The file must also include, among other things, a 'clean' target which
  deletes the generated files.

• Your code must be commented (modules, functions, structures, constants, ...) and correctly
  indented.

• The code symbols (variables, functions, types, files, etc.) must be in the same language as
  the comments (either all in English or all in French, with no mixing of languages).

• The C program and the Shell script must follow all the guidelines described in this document.

• The C program and the Shell script must not generate any unexpected errors. There must be no
  segmentation faults, syntax errors, unknown command names, etc.
  This criterion weighs very heavily on the final grade: it is therefore your responsibility to
  test your program properly so that you can correct these situations before delivery.

• If an error is detected by your program/script, an error message must be displayed to tell the
  user the cause, and a return code with a strictly positive value must be returned.

• Structures temporarily allocated in your program must be deallocated explicitly before the
  end. The amount of memory not freed at the end of execution will be evaluated. Remember to
  call the free(...) function when necessary!

• In addition, the amount of RAM consumed by your program will also be graded: remember to limit
  your memory footprint.
  Be careful, though, to make sure your program works: memory optimization is only a plus. The
  first step is to make something functional.

• The CSV data file you are given is fixed. It is therefore possible for a group of students to
  "hard code" the expected results. To prevent this form of cheating, your program may be
  evaluated with a CSV data file different from yours (but similar in structure, size, etc.).
  Make your program truly generic to avoid a bad surprise during the evaluation.

USEFUL RESOURCES

• GitHub
  website: https://github.com/

• CSV format
  website: https://fr.wikipedia.org/wiki/Comma-separated_values

• GnuPlot
  website: http://gnuplot.info/

Eva ANSERMIN - Romuald GRIGNON • 2024-2025 • préING2
eva.ansermin@cyu.fr - romuald.grignon@cyu.fr
