Projet C-Wire Preing2 2024 2025-v1.4
PréING2 FIELD • 2024-2025
AUTHORS E.ANSERMIN – R.GRIGNON
E-MAILS eva.ansermin@cyu.fr – romuald.grignon@cyu.fr
GENERAL DESCRIPTION
- This project requires you to write a program that synthesizes data from an electricity distribution system.
- To do this, you have at your disposal a .csv file containing a large set of data detailing the distribution of electricity in France, from power plants, through distribution substations, to the businesses and individuals who are the end customers.
- The data provided are fictitious, but the orders of magnitude are equivalent to those of a country one quarter the size of metropolitan France in terms of production, consumption and number of consumers.
- Your job is to filter and process this data using a shell script (filtering part) and a C program (processing/calculation part).
ENERGY DISTRIBUTION
- Each power plant sends its electrical energy at very high voltage (~400 kV) to several HV-B (High Voltage B) substations, which lower the voltage (> 50 kV) and supply energy to a large geographical area.
  Each HV-B station transfers the energy to several HV-A (High Voltage A) substations, which further lower the voltage (> 1000 V) to provide energy at a more regional level.
  Then each HV-A substation transfers the energy to several LV (Low Voltage) stations, which are responsible for distributing the energy to individuals or small businesses (230 V).
- The topology is therefore a rooted tree, that is, a directed graph without cycles, starting from each power plant. The diagram of a tree for a power plant is as follows:
Machine Translated by Google
The power plant is the root of the tree and supplies several HV-B stations. These stations, spread over different areas, redistribute the energy to other intermediate stations or directly to high-energy consumers.
HV-B stations (first-level main nodes)
• These stations are the first to receive energy from the power plant.
• There may be several HV-B stations, each covering a specific geographical area or sector.
• They supply:
  - Very energy-intensive consumers (such as large companies using a lot of electricity, e.g. SNCF, steelworks, factories).
  - HV-A stations, which serve intermediate consumers.
DATA FORMAT
- The DATA_CWIRE.csv file contains the distribution and consumption information of all the actors mentioned above.
  Given their number, this file is very large and cannot be processed manually or with a spreadsheet.
  This part explains its content so that its processing can be automated.
- Each line of the file represents an actor, or a link between two actors, of this distribution network (power plant, HV-B, HV-A, LV, company or individual).
- Many fields in this file will be empty. For example, the actors that supply electricity are the power plants, the HV-B and HV-A stations and the LV substations. The rows corresponding to these actors contain a value in the "Capacity" column (production/transfer capacity), while the "Load" column (consumption) is empty. Conversely, for rows relating to electricity consumers (companies and individuals), the "Load" column is filled in and the "Capacity" column remains empty.
- Concerning the identifier columns: for each row, the column corresponding to the type of actor is filled in with the identifier of the actor concerned. For example, a row relating to an HV-A station has the "HV-A Station" column filled in with the identifier of this station. In addition, the identifier of the station that directly supplies it with energy is also indicated (see examples below). This makes it possible to trace the origin of the energy distributed.
- HV-B station
  3;73;-;-;-;-;554172263;-
  • defines an HV-B node
  • contains the identifier of the parent plant (3)
  • contains the identifier of the HV-B station (73)
  • contains the transfer capacity (554172263 kWh)
- HV-A station
  2;31;119;-;-;-;284930294;-
  • defines an HV-A node
  • contains the identifier of the root plant (2)
  • contains the identifier of the parent HV-B station (31)
  • contains the identifier of the HV-A station (119)
  • contains the transfer capacity (284930294 kWh)
- LV station
  3;-;274;103733;-;-;112040;-
  • defines an LV node
  • contains the identifier of the root plant (3)
  • contains the identifier of the parent HV-A station (274)
  • contains the identifier of the LV station (103733)
  • contains the transfer capacity (112040 kWh)
- LV consumer (company)
  1;-;-;10520;69817;-;-;20659
  • defines an LV consumer (leaf)
  • contains the identifier of the root plant (1)
  • contains the identifier of the parent LV station (10520)
  • contains the identifier of the consumer (69817)
  • contains the customer consumption (20659 kWh)
- LV consumer (individual)
  1;-;-;10520;-;411970;-;5270
  • defines an LV consumer (leaf)
  • contains the identifier of the root plant (1)
  • contains the identifier of the parent LV station (10520)
  • contains the identifier of the consumer (411970)
  • contains the customer consumption (5270 kWh)
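The row examples above can be read into a small structure. The following is a minimal sketch, not a reference implementation: the column order (plant; HV-B; HV-A; LV; company; individual; capacity; load) is inferred from the examples, and the names Record, parse_record and EMPTY_FIELD are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define FIELD_COUNT 8
#define EMPTY_FIELD (-1LL)   /* hypothetical marker for a '-' field */

/* Column order inferred from the example rows (assumption):
   plant; HV-B; HV-A; LV; company; individual; capacity; load */
enum { COL_PLANT, COL_HVB, COL_HVA, COL_LV,
       COL_COMPANY, COL_INDIVIDUAL, COL_CAPACITY, COL_LOAD };

typedef struct {
    long long field[FIELD_COUNT];   /* EMPTY_FIELD when the column holds '-' */
} Record;

/* Parses one ';'-separated line into rec.
   Returns 0 on success, 1 if the line does not hold exactly 8 valid fields. */
int parse_record(const char *line, Record *rec)
{
    assert(line != NULL && rec != NULL);

    char buffer[256];
    if (strlen(line) >= sizeof buffer)
        return 1;                               /* line too long for this sketch */
    strcpy(buffer, line);
    buffer[strcspn(buffer, "\r\n")] = '\0';     /* drop the end-of-line, if any */

    int count = 0;
    for (char *tok = strtok(buffer, ";"); tok != NULL; tok = strtok(NULL, ";")) {
        if (count == FIELD_COUNT)
            return 1;                           /* too many fields */
        if (strcmp(tok, "-") == 0) {
            rec->field[count] = EMPTY_FIELD;
        } else {
            char *end;
            long long value = strtoll(tok, &end, 10);
            if (*end != '\0' || value < 0)
                return 1;                       /* not a non-negative integer */
            rec->field[count] = value;
        }
        count++;
    }
    return count == FIELD_COUNT ? 0 : 1;
}
```

Calling parse_record("3;73;-;-;-;-;554172263;-", &rec) fills rec.field[COL_HVB] with 73 and rec.field[COL_LOAD] with EMPTY_FIELD. A header line, if the real file has one, would be rejected with return code 1 and has to be skipped by the caller.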
OBJECTIVE
- Your project should allow you to analyze the stations (power plants, HV-B stations, HV-A stations, LV stations) to determine whether they are in a situation of overproduction or underproduction of energy, and to assess what proportion of their energy is consumed by businesses and by individuals.
- Output files
  Each line of the expected output files should contain 3 columns:
  • station identifier
  • capacity in kWh
  • consumption in kWh
  The first line of each output file should be a header naming the columns, depending on the parameters given:
  • Station HVB, Station HVA or Station LV
  • Capacity
  • Consumption (businesses), Consumption (individuals) or Consumption (all)
  These files will be CSV files with the character ':' (colon) as separator. Each file is named after the station option (hvb, hva, lv), followed by an underscore '_', followed by the consumer option (comp, indiv or all). If there is additional filtering by power plant number, that number is appended to the name.
  Ex: hva_comp.csv or lv_all.csv or lv_all_2.csv
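The naming rule and the header line can be sketched in C as follows. This is an illustration under assumptions: the function names are hypothetical, a negative plant_id is our own convention for "no power plant filter", and the exact header labels come from a translated text, so they may differ from what is expected.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Writes e.g. "lv_all.csv" or "lv_all_2.csv" into out.
   plant_id < 0 means "no power plant filter" (our convention, an assumption).
   Returns 0 on success, 1 if the name does not fit in out. */
int build_output_name(char *out, size_t size,
                      const char *station, const char *consumer, int plant_id)
{
    assert(out != NULL && station != NULL && consumer != NULL);
    int n;
    if (plant_id < 0)
        n = snprintf(out, size, "%s_%s.csv", station, consumer);
    else
        n = snprintf(out, size, "%s_%s_%d.csv", station, consumer, plant_id);
    return (n < 0 || (size_t)n >= size) ? 1 : 0;
}

/* Writes the header line with ':' as separator.
   The label texts are assumptions based on the translated specification.
   Returns 0 on success, 1 if the header does not fit in out. */
int build_header(char *out, size_t size, const char *station, const char *consumer)
{
    assert(out != NULL && station != NULL && consumer != NULL);
    const char *station_label =
        strcmp(station, "hvb") == 0 ? "Station HVB" :
        strcmp(station, "hva") == 0 ? "Station HVA" : "Station LV";
    const char *consumer_label =
        strcmp(consumer, "comp") == 0 ? "Consumption (businesses)" :
        strcmp(consumer, "indiv") == 0 ? "Consumption (individuals)" :
                                         "Consumption (all)";
    int n = snprintf(out, size, "%s:Capacity:%s", station_label, consumer_label);
    return (n < 0 || (size_t)n >= size) ? 1 : 0;
}
```

With station "lv", consumer "all" and plant 2, build_output_name produces lv_all_2.csv. The sketch does not validate the option strings; it assumes the shell script has already checked them.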
- Given the very large number of LV stations, for the lv all option the script must perform additional processing: storing in a separate file only the 10 LV stations with the highest consumption and the 10 LV stations with the lowest consumption.
  These 20 LV stations will be sorted by absolute amount of excess energy consumed (from the most loaded station to the least loaded). In other words, the difference between total capacity and total consumption is computed for each station, and the stations are sorted by increasing value of this difference.
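The ordering described above (increasing value of capacity minus consumption) can be sketched with the standard qsort function. The specification does not say which tool should perform this sort, so doing it in C is only one option; the StationTotal type and the function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    long long id;
    long long capacity;      /* kWh */
    long long consumption;   /* kWh */
} StationTotal;

/* qsort comparator: increasing (capacity - consumption), so the most
   overloaded station (most negative difference) comes first. */
static int compare_by_margin(const void *a, const void *b)
{
    const StationTotal *sa = a;
    const StationTotal *sb = b;
    long long ma = sa->capacity - sa->consumption;
    long long mb = sb->capacity - sb->consumption;
    return (ma > mb) - (ma < mb);   /* avoids overflow of a plain subtraction */
}

void sort_by_margin(StationTotal *stations, size_t count)
{
    assert(stations != NULL || count == 0);
    if (count > 1)
        qsort(stations, count, sizeof *stations, compare_by_margin);
}
```

For three stations with a capacity of 100 kWh and consumptions of 40, 250 and 100 kWh, the differences are 60, -150 and 0, so the station consuming 250 kWh ends up first.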
- To meet this overall objective, you will have to create a shell script as well as a C program, whose more specific objectives are detailed in the following sections.
Script parameters (table fragment):
• input file path: required; indicates the location of the input file
• hva all
• hva indiv
- The shell script must ensure that all mandatory options are present and that their values are consistent before starting any processing. If an option is invalid, an error message detailing the error must be displayed, followed by the help text, as if the user had passed the -h option.
- The shell script must check for the presence of the C executable on disk and, if it is missing, launch the compilation and check that it completed correctly. If it did not, an error message must be displayed. Once the compilation is done, the script can perform the processing requested in its arguments.
- The shell script must check for the presence of the tmp and graphs folders (see Additional Information); if they do not exist, it must create them. If the tmp folder already exists, it must be emptied before running the processing.
- The duration of each processing step must be displayed in seconds at the end. The durations must be displayed systematically, whether the script ran correctly or encountered an error. They must not include the compilation of the C program or the creation of the files carried out at the start: the times displayed must be the useful data-processing times. If the program fails on the parameter values, no processing will have been launched, so the time displayed will be 0.0sec.
- Once the data processing is finished, the shell script must create the files and/or graphs containing the output data of the processing. For the graphs you can use the GnuPlot program (see Bonus).
- When the script is given a station type (hvb, hva, lv), the final goal is to create a file listing these stations with their capacity (the amount of energy each can let through) and the sum of the consumption of the consumers connected directly to it.
- The sum of consumers will only take into account those requested (comp: companies only, indiv: individuals only, all: all consumers).
- To compute the sum of the consumers of a station, you must build a balanced binary search tree. This is the role of the C part of your project. This processing (the sum) must be done by the C program. If your script performs all or part of this sum in the shell script, your grade will be penalized.
C PROGRAM
- The goal of this program is to add up the consumption for a type of station. Given the large number of data rows and stations, we will use an AVL tree in order to limit the processing time.
- Each node of the AVL tree represents a station and therefore contains the station identifier as well as its data, such as its capacity and the sum of its consumers' consumption, which is updated as the data is read by your program.
- The format of the input data of the C program is left free. It is up to you to decide whether to pass the entire data file or to filter it (rows/columns) with the shell script before passing the information to the C program.
- The format of the output data of the C program is also left free, as long as it allows you to retrieve the information necessary for your processing.
- The C program aims to calculate the sum of consumptions for an HV-B, HV-A or LV station, using an AVL tree (balanced binary search tree). A generic solution must be considered to process these three types of stations with a single program (optimize data transmission).
- Any implementation using only an unbalanced binary search tree (ABR)
Additionally, all dynamic allocations must be freed before the C program can terminate
properly. However, if the C program encounters an error, it is not required to free all memory
explicitly.
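As an illustration of the requirements above, here is a minimal sketch of an AVL tree keyed by station identifier that accumulates capacity and consumption, with a function that frees every node. The node layout and the names (Node, avl_add, avl_free) are assumptions, as is the convention that a station row adds its capacity with a load of 0 and a consumer row adds its load with a capacity of 0.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Node {
    long long id;            /* station identifier (search key) */
    long long capacity;      /* kWh */
    long long consumption;   /* running sum of consumer loads, kWh */
    int height;
    struct Node *left, *right;
} Node;

static int height(const Node *n) { return n ? n->height : 0; }

static void update(Node *n)
{
    int hl = height(n->left), hr = height(n->right);
    n->height = 1 + (hl > hr ? hl : hr);
}

static Node *rotate_right(Node *y)
{
    Node *x = y->left;
    y->left = x->right;
    x->right = y;
    update(y);
    update(x);
    return x;
}

static Node *rotate_left(Node *x)
{
    Node *y = x->right;
    x->right = y->left;
    y->left = x;
    update(x);
    update(y);
    return y;
}

/* Restores the AVL property at n after an insertion below it. */
static Node *rebalance(Node *n)
{
    update(n);
    int balance = height(n->left) - height(n->right);
    if (balance > 1) {
        if (height(n->left->left) < height(n->left->right))
            n->left = rotate_left(n->left);      /* left-right case */
        return rotate_right(n);
    }
    if (balance < -1) {
        if (height(n->right->right) < height(n->right->left))
            n->right = rotate_right(n->right);   /* right-left case */
        return rotate_left(n);
    }
    return n;
}

/* Adds capacity and load to station id, creating its node if needed.
   Returns the new subtree root; sets *ok to 0 if an allocation fails. */
Node *avl_add(Node *root, long long id, long long capacity, long long load, int *ok)
{
    assert(ok != NULL);
    if (root == NULL) {
        Node *n = malloc(sizeof *n);
        if (n == NULL) { *ok = 0; return NULL; }
        *n = (Node){ id, capacity, load, 1, NULL, NULL };
        return n;
    }
    if (id < root->id) {
        root->left = avl_add(root->left, id, capacity, load, ok);
    } else if (id > root->id) {
        root->right = avl_add(root->right, id, capacity, load, ok);
    } else {
        root->capacity += capacity;              /* existing station: accumulate */
        root->consumption += load;
        return root;
    }
    return rebalance(root);
}

/* Frees every node (post-order), to be called before a normal exit. */
void avl_free(Node *root)
{
    if (root == NULL) return;
    avl_free(root->left);
    avl_free(root->right);
    free(root);
}
```

A caller would start with root = NULL and ok = 1, call root = avl_add(root, id, capacity, load, &ok) for each row read, stop if ok becomes 0, and finish with avl_free(root). Because the station type only changes which identifier column is used as the key, the same tree can serve HV-B, HV-A and LV requests.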
- Finally, the C program must be compiled with the 'make' utility using a 'Makefile'. The only compilation instruction expected in your shell script is the call to the 'make' executable: no direct call to gcc from your script.
It will therefore be very difficult, if not impossible, to have a precise overview of this file with
traditional office tools. You will have to use a computer program to extract the data you need.
BONUS
- For the lv all processing, you are asked, as a bonus, to create a bar graph of the 10 most loaded and the 10 least loaded LV stations.
- These 20 bars will be displayed with explicit color coding (red/green) to indicate how much energy is consumed in excess or whether there is margin.
- To create the graph as an image, use the Linux command-line utility GnuPlot.
- You can add bonuses of your choice (other options, interface, etc.) as long as you consider the rest of the specifications fulfilled.
GRADING CRITERIA
- The work will be delivered as a GitHub link to the project. There is no need to send files by email, Teams or any other means: the only deliverable expected and evaluated is the link to the git repository containing all the files of your project. Before the delivery date you can set this repository to "private" to prevent other people from plagiarizing you.
  By the due date, this repository must be publicly visible so that your project managers can freely access it. Any delay in accessing this public repository will have a negative impact on your final grade.
- The code repository will contain, in addition to the code files, a ReadMe text file with instructions for compiling and using your application. It will also contain a PDF document presenting the distribution of tasks within the group, the implementation schedule, and the functional limitations of your application (the list of what is not implemented, and/or what is implemented but does not work correctly or completely).
- You are required to push to the repository at least once after each tutorial session on the project, even if the code is not functional.
  If you change your code between two sessions, you must commit no later than the day before the next session, to properly record the history of your modifications and to be able to work on them with your TD supervisor.
  In practice, there will therefore be at least one commit just before and one just after each session.
  Modifications, even non-functional ones, should not stay for several days without being stored on the code repository; this avoids a catastrophic loss if your machine breaks down. It is your responsibility to archive your modifications regularly on the repository. No loss of code can be taken into account if you do not use your repository.
  This will also force you to develop as a team around a shared central repository that contains the latest changes to the project.
- The deliverable is group work: if similarities between groups are found, and/or if examples available on the Internet are used without being sourced, an exam cheating procedure may be initiated. The educational goal of this project is for you to write this program yourselves and to master all of the code you deliver.
- The C code will be separated into modules (.c and .h files, sub-files).
- The code symbols (variables, functions, types, files, etc.) will be in the same language as the comments (either all in English or all in French, with no mixing of languages).
- The C program and the shell script must follow all the guidelines described in this document.
- The C program and the shell script must not generate any unexpected errors. There must be no segmentation faults, syntax errors, unknown command names, etc.
  This criterion will weigh very heavily on the final grade: it is therefore your responsibility to test your program properly so that you can correct these situations before delivery.
- If an error is detected by your program/script, an error message must be displayed to indicate the cause to the user, and a return code with a strictly positive value must be returned.
- The CSV data file you are given is fixed. A group of students could therefore "hard code" the expected results. To prevent this form of cheating, your program may be evaluated with a CSV data file different from yours (but similar in structure, size, etc.). Make your program truly generic to avoid a bad surprise during the evaluation.
USEFUL RESOURCES
GitHub
- Website: https://github.com/
CSV format
- Website: https://fr.wikipedia.org/wiki/Comma-separated_values
GnuPlot