Sample Final Project Documentation

Uploaded by

Dinesh Raj
Copyright
© All Rights Reserved

A SECURE FRAMEWORK FOR CLIENT SIDE

DEDUPLICATION IN BIGDATA APPLICATIONS

A PROJECT REPORT

Submitted by

KOWSALYA.A (815416104005)
MANIMEGALAI.S (815416104006)
MEENA.P (815416104007)

in partial fulfillment for the award of the degree


of

BACHELOR OF ENGINEERING

IN

COMPUTER SCIENCE AND ENGINEERING

SRI RAMAKRISHNA COLLEGE OF ENGINEERING

ANNA UNIVERSITY : CHENNAI 600 025

APRIL/MAY 2024
ANNA UNIVERSITY : CHENNAI 600 025

BONAFIDE CERTIFICATE

Certified that this project report “A SECURE FRAMEWORK FOR CLIENT SIDE
DEDUPLICATION IN BIGDATA APPLICATIONS” is the bonafide work of
“KOWSALYA. A, MANIMEGALAI. S, MEENA. P” who carried out the project work
under my supervision. No part of the dissertation has been submitted for any
degree or any other academic award anywhere before.

SIGNATURE
Mr. R. DINESH RAJ, M.E.,
HEAD OF THE DEPARTMENT
Department of CSE
Sri Ramakrishna College of Engineering
Perambalur

SIGNATURE
Mrs. C. SURYA, M.E.,
ASSISTANT PROFESSOR
Department of CSE
Sri Ramakrishna College of Engineering
Perambalur

Submitted for the project viva-voce held on……………….

INTERNAL EXAMINER EXTERNAL EXAMINER

ACKNOWLEDGEMENT

Our first and foremost thanks go to our beloved Secretary,
Er. M.S. VIVEKANANDAN, for providing us with excellent lab facilities and
constantly encouraging us to pursue new goals and ideas.

We would like to thank our Principal, Dr. M. MARIMUTHU, for his help and
constant support. His constant attention and guidance have been instrumental
in making this project a reality.

We would like to thank Mr. R. DINESH RAJ, Head of the Department of Computer
Science and Engineering, whose support, valuable guidance, suggestions and
constant encouragement paved the way for the successful completion of this
project work.

Our project guide, Mrs. C. SURYA, Assistant Professor, Department of Computer
Science and Engineering, deserves a special vote of thanks for her constant
inspiration throughout the project period.

We also thank the faculty members of the Department of Computer Science and
Engineering, Sri Ramakrishna College of Engineering, Perambalur, for their
remarkable help in completing this project.

We thank all our friends, who have been understanding, co-operative and
appreciative, and who stood with us as pillars of support during our good and
bad times. On the whole, we express our heartfelt gratitude to our parents,
without whom we could not have reached this point in our careers. Once again,
we thank one and all who have helped us in completing this dissertation.

ABSTRACT

Data deduplication is one of the important data compression techniques for
eliminating duplicate copies of repeating data, and has been widely used in cloud
storage to reduce the amount of storage space and save bandwidth. To protect the
confidentiality of sensitive data while supporting deduplication, the convergent
encryption technique has been proposed to encrypt the data before outsourcing. To
better protect data security, this work makes the first attempt to formally address
the problem of authorized data deduplication. Different from traditional
deduplication systems, the differential privileges of users are considered in the
duplicate check in addition to the data itself. We also present several new
deduplication constructions supporting authorized duplicate check in a hybrid
cloud architecture. Security analysis demonstrates that our scheme is secure in
terms of the definitions specified in the proposed security model. As a proof of
concept, we implement a prototype of our proposed authorized duplicate check
scheme and conduct testbed experiments using our prototype. We show that our
proposed authorized duplicate check scheme incurs minimal overhead compared to
normal operations.

TABLE OF CONTENTS

CHAPTER NO. TITLE

ABSTRACT
LIST OF FIGURES
LIST OF ABBREVIATIONS

1 INTRODUCTION
1.1 Cloud Computing
1.2 Cloud Computing Applications
2 SYSTEM ANALYSIS
2.1 Literature Survey
2.1.1 Reducing Impact Of Data Fragmentation Caused By In-Line Deduplication
2.1.2 Flexible Data Access Control Based On Trust And Reputation In Cloud Computing
2.1.3 Improved Proxy Re-Encryption Schemes With Application To Secure Distributed Storage
2.1.4 A Hybrid Cloud Approach For Secure Authorized Deduplication
2.1.5 Secure Cloud Data Deduplication With Efficient Re-Encryption
2.1.6 Secure Deduplication With Efficient And Reliable Convergent Key Management
2.1.7 DupLESS: Server-Aided Encryption For Deduplicated Storage
2.1.8 Survey On Secret Sharing Scheme With Deduplication In Cloud Computing
2.1.9 A Secure Data Deduplication Scheme For Cloud Storage
2.2 Existing System
2.3 Proposed System
3 SYSTEM ARCHITECTURE
4 SYSTEM SPECIFICATION
4.1 Hardware Requirements
4.2 Software Requirements
5 SOFTWARE DESCRIPTION
5.1 Java
5.2 The Java Platform
5.3 What Can Java Technology Do?
5.4 How Will Java Technology Change My Life?
5.5 ODBC
5.6 JDBC
5.7 JDBC Goals
5.8 Networking TCP/IP Stack
5.9 JFreeChart
5.9.1 Map Visualizations
5.9.2 Time Series Chart Interactivity
5.9.3 Dashboards
5.9.4 Property Editors
6 SYSTEM / SUBSYSTEM SPECIFICATION
6.1 Modules
6.1.1 Registration
6.1.2 File Upload
6.1.3 Checking for Duplication
6.1.4 Naive Bayes Classifier
6.1.5 AES Encryption
6.2 System Description
6.2.1 Registration
6.2.1.1 File Upload
6.2.1.2 Naive Bayes Classifier
6.2.1.3 AES Encryption
7 SYSTEM DESIGN
7.1 Data Flow Diagram
7.2 Use Case Diagram
7.3 Sequence Diagram
7.4 Activity Diagram
8 SYSTEM TESTING
8.1 System Testing
8.2 Performance Testing
8.3 Security Testing
8.4 White Box Testing
8.5 Black Box Testing
9 SYSTEM IMPLEMENTATION
9.1 User Training
9.2 Training on the Application Software
9.3 Operational Documentation
9.4 System Maintenance
9.5 Corrective Maintenance
9.6 Adaptive Maintenance
9.7 Perceptive Maintenance
9.8 Preventive Maintenance
10 CONCLUSION

APPENDIX-1

APPENDIX-2

REFERENCES

LIST OF FIGURES

FIGURE NO. NAME

1. System Architecture
2. Data Flow Diagram
3. Use Case Diagram
4. Sequence Diagram
5. Activity Diagram

LIST OF ABBREVIATIONS

1. DAB Digital Audio Broadcasting.
2. GPS Global Positioning System.
3. NDT Non Destructive Testing.
4. WSN Wireless Sensor Network.
5. SHM Structural Health Monitoring.
6. MPPT Maximum Power Point Tracking.
7. LMS Least Mean Square.
8. JVM Java Virtual Machine.
9. JRE Java Runtime Environment.
10. DFD Data Flow Diagram.
11. UML Unified Modeling Language.

CHAPTER - 1

INTRODUCTION

1.1 CLOUD COMPUTING

Cloud computing has progressed from a bold vision to massive deployments in
various application domains. However, the complexity of the technology
underlying cloud computing introduces novel security risks and challenges.
From an end-user point of view, the security of cloud infrastructure
implies unquestionable trust in the cloud provider, in some cases corroborated
by reports of external auditors. While providers may offer security
enhancements such as protection of data at rest, end-users have limited or no
control over such mechanisms.

There is a clear need for usable and cost-effective cloud platform security
mechanisms suitable for organizations that rely on cloud infrastructure. One
such mechanism is platform integrity verification for compute hosts that support
the virtualized cloud infrastructure. Several large cloud vendors have signaled
practical implementations of this mechanism, primarily to protect the cloud
infrastructure from insider threats and advanced persistent threats. We see two
major improvement vectors regarding these implementations.

First, details of such proprietary solutions are not disclosed and thus cannot
be implemented and improved by other cloud platforms. Second, to the best of
our knowledge, none of the solutions provides cloud tenants a proof regarding
the integrity of compute hosts supporting their slice of the cloud infrastructure.
To address this, we propose a set of protocols for trusted launch of virtual
machines (VM) in IaaS, which provide tenants with a proof that the requested
VM instances were launched on a host with an expected software stack.

Another relevant security mechanism is encryption of virtual disk
volumes, implemented and enforced at compute host level. While support for
encryption of data at rest is offered by several cloud providers and can be
configured by tenants in their VM instances, the functionality and migration
capabilities of such solutions are severely restricted. In most cases cloud
providers maintain and manage the keys necessary for encryption and decryption
of data at rest. This further complicates the already complex data migration
procedure between different cloud providers, disadvantaging tenants through a
new variation of vendor lock-in.

1.2 CLOUD COMPUTING APPLICATIONS:

Photo editing software

Picnik, Pixlr, etc. are popular free online photo editing tools. They
offer features such as cropping, resizing, rotation by a given number of
degrees, and special effects, all presented through a GUI (Graphical User
Interface). Some of them also offer paint tools and other adjustment features.
Brightness and contrast can be edited, and users can layer images. Pixlr,
although it has various high-level, complex features, is still easy to use.

Infrastructure as a service (IaaS) and platform as a service (PaaS)

When it comes to IaaS, using an existing infrastructure on a pay-per-use


scheme seems to be an obvious choice for companies saving on the cost of
investing to acquire, manage and maintain an IT infrastructure. There are also
instances where organizations turn to PaaS for the same reasons while also
seeking to increase the speed of development on a ready-to-use platform to
deploy applications.

Private cloud and hybrid cloud

Among the many incentives for using the cloud, there are two situations
where organizations look for ways to assess some of the applications they
intend to deploy into their environment through the use of a cloud
(specifically a public cloud). While in the case of test and development the
use may be limited in time, adopting a hybrid cloud approach allows for testing
application workloads, providing the comfort of an environment without an
initial investment that would be rendered useless should the workload testing
fail. Another use of hybrid cloud is the ability to expand during limited
periods of peak usage, which is often preferable to hosting a large
infrastructure that might seldom be of use. An organization would seek to have
the additional capacity and availability of an environment when needed on a
pay-as-you-go basis.

Test and development

Probably the best scenario for the use of a cloud is a test and development
environment. This entails securing a budget, setting up your environment
through physical assets, significant manpower and time. Then comes the
installation and configuration of your platform. All this can often extend the
time it takes for a project to be completed and stretch your milestones. With
cloud computing, there are now readily available environments tailored for your
needs at your fingertips. This often combines, but is not limited to, automated
provisioning of physical and virtualized resources.

Big data analytics

One of the aspects offered by leveraging cloud computing is the ability to
tap into vast quantities of both structured and unstructured data to extract
business value. Retailers and suppliers are now extracting information derived
from consumers' buying patterns to target their advertising and marketing
campaigns to a particular segment of the population. Social networking
platforms are now providing the basis for analytics on behavioral patterns that
organizations are using to derive meaningful information.

File storage

The cloud lets you store your files and access and retrieve them from any
web-enabled interface. The web services interfaces are usually simple. At any
time and place you have high availability, speed, scalability and security for
your environment. In this scenario, organizations pay only for the amount of
storage they actually consume, and do so without the worries of overseeing the
daily maintenance of the storage infrastructure.

Disaster recovery

Another benefit of using the cloud is the cost-effectiveness of a disaster
recovery (DR) solution that provides faster recovery from a mesh of different
physical locations at a much lower cost than the traditional DR site with fixed
assets and rigid procedures.

CHAPTER – 2

SYSTEM ANALYSIS

2.1 LITERATURE SURVEY

2.1.1 REDUCING IMPACT OF DATA FRAGMENTATION CAUSED BY IN-LINE DEDUPLICATION

AUTHORS: Michal Kaczmarczyk, Marcin Barczynski, Cezary Dubnicki

YEAR: 2012

Deduplication inevitably results in data fragmentation, because logically
continuous data is scattered across many disk locations. In this work we focus
on fragmentation caused by duplicates from previous backups of the same
backup set, since such duplicates are very common due to repeated full backups
containing a lot of unchanged data. The approach increases restore speed with
only a small increase in writing time, between 1% and 5%. Since we rewrite only
a few duplicates and old copies of rewritten data are removed in the
background, the whole process introduces a small and temporary space overhead.
The drawback is increased storage consumption, because space is needed for
data before deduplication.

Advantage

• Increases restore speed.

Disadvantage

• Increased storage.

2.1.2 FLEXIBLE DATA ACCESS CONTROL BASED ON TRUST AND
REPUTATION IN CLOUD COMPUTING

AUTHORS: Zheng Yan, Xueyun Li, Mingjun Wang, V. Vasilakos

YEAR: 2015

Cloud computing offers a new way of delivering services and has become a
popular service platform. Storing user data at a cloud data centre greatly
relieves the storage burden of user devices and brings access convenience. Due
to distrust in cloud service providers, users generally store their crucial
data in encrypted form. But in many cases, the data needs to be accessed by
other entities to fulfil an expected service, e.g., an eHealth service. In this
paper, we propose a scheme to control data access in cloud computing based on
trust evaluated by the data owner and/or reputations generated by a number of
reputation centers in a flexible manner by applying Attribute-Based Encryption
and Proxy Re-Encryption. The main drawback of this approach is that the number
of keys managed by the data owner grows linearly with the number of data
groups.

Advantage

• Data privacy.

Disadvantage

• The number of keys managed by the data owner grows with the number of
data groups.

2.1.3 IMPROVED PROXY RE-ENCRYPTION SCHEMES WITH
APPLICATION TO SECURE DISTRIBUTED STORAGE

AUTHORS: Giuseppe Ateniese, Kevin Fu, Matthew Green, Susan Hohenberger

YEAR: 2006

In 1998, Blaze, Bleumer, and Strauss (BBS) proposed an application
called atomic proxy re-encryption, in which a semi-trusted proxy converts a
ciphertext for Alice into a ciphertext for Bob without seeing the underlying
plaintext. We predict that fast and secure re-encryption will become
increasingly popular as a method for managing encrypted file systems.
Although efficiently computable, the widespread adoption of BBS re-encryption
has been hindered by considerable security risks. The scheme is only useful
when the trust relationship between Alice and Bob is mutual. Achieving
security based on the definition of the re-encryption key generation function
RG as originally stated is very difficult.

Advantage

• Reduces the data duplication rate.

Disadvantage

• The OLB algorithm is used, which keeps the node busy.

2.1.4 A HYBRID CLOUD APPROACH FOR SECURE AUTHORIZED
DEDUPLICATION

AUTHORS: Jin Li, Yan Kit Li, Xiaofeng Chen, Patrick P. C. Lee

YEAR: 2015

Data deduplication is one of the important data compression techniques for
eliminating duplicate copies of repeating data, and has been widely used in
cloud storage to reduce the amount of storage space and save bandwidth. To
protect the confidentiality of sensitive data while supporting deduplication, the
convergent encryption technique has been proposed to encrypt the data before
outsourcing. To better protect data security, this paper makes the first attempt to
formally address the problem of authorized data deduplication. Convergent
encryption allows the cloud to perform deduplication on the ciphertexts, and
the proof of ownership prevents unauthorized users from accessing the file.

Advantage

• Protects data confidentiality.

Disadvantage

• Previous deduplication systems cannot support differential authorization
duplicate check.
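
The idea that convergent encryption lets the cloud deduplicate ciphertexts can be illustrated with a minimal Java sketch. This is not the construction from the surveyed paper: the class and method names are hypothetical, the key is assumed to be the first 16 bytes of SHA-256 of the content, and a fixed all-zero IV is assumed so that equal plaintexts produce equal ciphertexts.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Illustrative sketch only; names and parameter choices are assumptions.
class ConvergentEncryptionSketch {

    // Convergent key derived from the content itself:
    // K = first 16 bytes of SHA-256(data), used as an AES-128 key.
    static byte[] deriveKey(byte[] data) {
        try {
            byte[] hash = MessageDigest.getInstance("SHA-256").digest(data);
            return Arrays.copyOf(hash, 16);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Deterministic AES-CBC with a fixed all-zero IV (assumption: chosen so
    // that identical plaintexts give identical ciphertexts).
    private static byte[] crypt(int mode, byte[] input, byte[] key) {
        try {
            Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
            cipher.init(mode, new SecretKeySpec(key, "AES"),
                        new IvParameterSpec(new byte[16]));
            return cipher.doFinal(input);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    static byte[] encrypt(byte[] data) {
        return crypt(Cipher.ENCRYPT_MODE, data, deriveKey(data));
    }

    static byte[] decrypt(byte[] ciphertext, byte[] key) {
        return crypt(Cipher.DECRYPT_MODE, ciphertext, key);
    }

    public static void main(String[] args) {
        byte[] alice = "quarterly-report".getBytes(StandardCharsets.UTF_8);
        byte[] bob = "quarterly-report".getBytes(StandardCharsets.UTF_8);
        // Two users holding the same content produce the same ciphertext,
        // so the server can detect the duplicate without seeing the key.
        System.out.println(Arrays.equals(encrypt(alice), encrypt(bob)));
    }
}
```

Because the key depends only on the content, any user who holds the file can re-derive the key and decrypt, while the storage server only ever compares ciphertexts.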

2.1.5 SECURE CLOUD DATA DEDUPLICATION WITH EFFICIENT
RE-ENCRYPTION

AUTHORS: Shunrong Jiang, Tao Jiang, Liangmin Wang

YEAR: 2019

Data deduplication has been widely used in cloud storage to reduce
storage space and communication overhead by eliminating redundant data and
storing only one copy of it. In order to achieve secure data deduplication,
the convergent encryption scheme and many of its variants have been proposed.
However, most of these schemes do not consider or cannot address efficient
dynamic ownership changes and secure Proof-of-Ownership (PoW)
simultaneously. In this paper, we propose a secure data deduplication
scheme with an efficient PoW process for dynamic ownership management.
Specifically, our scheme supports both cross-user file-level and inside-user
block-level data deduplication. During file-level deduplication, we construct
a new PoW scheme to ensure tag consistency and achieve mutual ownership
verification. Moreover, we design a lazy update strategy to achieve efficient
ownership management. For inside-user block-level deduplication, a user-aided
key is used to realize convergent key management and reduce the key
storage space. Finally, the security and performance analysis demonstrate that
our scheme can ensure data confidentiality and tag consistency, and that it is
efficient in data ownership management.

Advantage

• Overcomes the stub-reserved attack.
• Data privacy.

Disadvantage

• Restore speed is very low.
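
To make the notion of block-level deduplication concrete, the following Java sketch stores a file as fixed-size blocks indexed by their SHA-256 fingerprint, so that a repeated block is kept only once. It is an assumption-laden illustration, not the surveyed scheme: it has no encryption, PoW or ownership management, and the class name, fixed block size and in-memory map are all hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative in-memory block store; not the surveyed paper's design.
class BlockDedupStoreSketch {
    private final Map<String, byte[]> blocks = new HashMap<>();
    private final int blockSize;

    BlockDedupStoreSketch(int blockSize) {
        this.blockSize = blockSize;
    }

    // Hex-encoded SHA-256 of a block, used as its fingerprint.
    private static String fingerprint(byte[] block) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(block);
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Splits the file into fixed-size blocks, stores only unseen blocks,
    // and returns the ordered list of fingerprints (the file "recipe").
    List<String> put(byte[] file) {
        List<String> recipe = new ArrayList<>();
        for (int off = 0; off < file.length; off += blockSize) {
            int end = Math.min(off + blockSize, file.length);
            byte[] block = Arrays.copyOfRange(file, off, end);
            String fp = fingerprint(block);
            blocks.putIfAbsent(fp, block);
            recipe.add(fp);
        }
        return recipe;
    }

    // Rebuilds a file from its recipe.
    byte[] get(List<String> recipe) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String fp : recipe) {
            byte[] block = blocks.get(fp);
            out.write(block, 0, block.length);
        }
        return out.toByteArray();
    }

    int storedBlockCount() {
        return blocks.size();
    }
}
```

With a block size of 4, a 12-byte file whose first and last blocks are identical yields a three-entry recipe but only two stored blocks.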

2.1.6 SECURE DEDUPLICATION WITH EFFICIENT AND RELIABLE
CONVERGENT KEY MANAGEMENT

AUTHORS: J. Li, X. Chen, M. Li, J. Li, P. Lee, and W. Lou

YEAR: 2013

Data deduplication is a technique for eliminating duplicate copies of data,
and has been widely used in cloud storage to reduce storage space and upload
bandwidth. Promising as it is, an arising challenge is to perform secure
deduplication in cloud storage. Although convergent encryption has been
extensively adopted for secure deduplication, a critical issue in making
convergent encryption practical is to efficiently and reliably manage a huge
number of convergent keys. This paper makes the first attempt to formally
address the problem of achieving efficient and reliable key management in
secure deduplication. We first introduce a baseline approach in which each user
holds an independent master key for encrypting the convergent keys and
outsourcing them to the cloud. However, such a baseline key management
scheme generates an enormous number of keys as the number of users
increases, and requires users to dedicatedly protect their master keys. As a
proof of concept, we implement the proposed construction, Dekey, using the
Ramp secret sharing scheme and demonstrate that Dekey incurs limited overhead
in realistic environments.

Advantage

• Reduces storage space and bandwidth
• Efficient
• Provides confidentiality

Disadvantage

• No convergent key sharing across multiple servers

2.1.7 DUPLESS: SERVER-AIDED ENCRYPTION FOR DEDUPLICATED STORAGE

AUTHORS: M. Bellare, S. Keelveedhi, and T. Ristenpart

YEAR: 2013

Cloud storage service providers such as Dropbox, Mozy, and others
perform deduplication to save space by only storing one copy of each file
uploaded. Should clients conventionally encrypt their files, however, savings
are lost. Message-locked encryption (the most prominent manifestation of
which is convergent encryption) resolves this tension. However, it is inherently
subject to brute-force attacks that can recover files falling into a known set. We
propose an architecture that provides secure deduplicated storage resisting
brute-force attacks, and realize it in a system called DupLESS. In DupLESS,
clients encrypt under message-based keys obtained from a key server via an
oblivious PRF protocol. It enables clients to store encrypted data with an
existing service, have the service perform deduplication on their behalf, and yet
achieve strong confidentiality guarantees. We show that encryption for
deduplicated storage can achieve performance and space savings close to that of
using the storage service with plaintext data.

Advantage

• Security
• High performance

Disadvantage

• Large storage inte
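
The brute-force weakness mentioned above can be sketched in a few lines of Java. The example is a simplification under stated assumptions: a plain SHA-256 of the message stands in for any value that depends only on the message (such as a deduplication tag or a convergent ciphertext), and the class name, method names and candidate strings are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch of a known-set brute-force attack; names are assumptions.
class KnownSetAttackSketch {

    // Stand-in for any value that is a deterministic function of the message.
    static byte[] deterministicTag(String message) {
        try {
            return MessageDigest.getInstance("SHA-256")
                    .digest(message.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // The attacker recomputes the tag for every candidate in the known set
    // and returns the one that matches the observed tag, or null if none does.
    static String recover(byte[] observedTag, List<String> candidates) {
        for (String candidate : candidates) {
            if (Arrays.equals(deterministicTag(candidate), observedTag)) {
                return candidate;
            }
        }
        return null;
    }
}
```

If the stored file is known to be one of a small set of possibilities, the loop finds it after at most one hash per candidate. Mixing in a secret held by a separate key server, as the surveyed paper describes for DupLESS, is what removes the attacker's ability to recompute the value offline.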

2.1.8 SURVEY ON SECRET SHARING SCHEME WITH DEDUPLICATION
IN CLOUD COMPUTING

AUTHORS: Dharani P, Berlinn M.A.

YEAR: 2015

Data deduplication is one of the techniques used for eliminating
duplicate copies of data, and is widely used in the cloud to reduce storage
space and increase bandwidth. Convergent encryption has been extensively
adopted for secure deduplication, but it requires efficient and reliable
management of a huge number of convergent keys. A baseline approach named
Dekey is used to distribute the convergent key shares across multiple servers.
However, implementing Dekey using the Ramp secret sharing scheme has a
limitation: a heavy computational cost is required to make n shares and to
recover the secret. As a solution to this problem, a new (K, L, n)-threshold
ramp scheme (an extension of the existing ramp scheme) is proposed, which is a
perfect, ideal and faster secret sharing scheme: every combination of k or more
participants can recover the secret, but every group of fewer than l
participants cannot obtain any information about the secret.

Advantage

• Widely used in the cloud to reduce storage space.
• Increases bandwidth.

Disadvantage

• Every group of fewer than k participants cannot obtain any
information about the secret.
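
The threshold property described above (any k shares recover the secret) can be illustrated with a small Java sketch. Note the substitution: this is plain Shamir (k, n) secret sharing over a prime field, not the ramp scheme or the (K, L, n) extension discussed in the survey. The class name, the choice of the prime 2^127 - 1 and the share layout are assumptions made for illustration.

```java
import java.math.BigInteger;
import java.security.SecureRandom;

// Illustrative Shamir (k, n) sketch; a simpler stand-in for ramp schemes.
class ShamirSketch {
    // Field modulus: the Mersenne prime 2^127 - 1 (secrets must be smaller).
    static final BigInteger P = BigInteger.ONE.shiftLeft(127).subtract(BigInteger.ONE);

    // Returns n shares, each as {x, f(x)}, where f is a random polynomial of
    // degree k - 1 whose constant term is the secret.
    static BigInteger[][] split(BigInteger secret, int k, int n) {
        SecureRandom rnd = new SecureRandom();
        BigInteger[] coeff = new BigInteger[k];
        coeff[0] = secret.mod(P);
        for (int i = 1; i < k; i++) {
            coeff[i] = new BigInteger(P.bitLength(), rnd).mod(P);
        }
        BigInteger[][] shares = new BigInteger[n][2];
        for (int x = 1; x <= n; x++) {
            BigInteger bx = BigInteger.valueOf(x);
            BigInteger y = BigInteger.ZERO;
            // Horner evaluation of f(x) mod P.
            for (int i = k - 1; i >= 0; i--) {
                y = y.multiply(bx).add(coeff[i]).mod(P);
            }
            shares[x - 1][0] = bx;
            shares[x - 1][1] = y;
        }
        return shares;
    }

    // Lagrange interpolation at x = 0 using the supplied shares
    // (at least k distinct shares are needed to get the secret back).
    static BigInteger recover(BigInteger[][] shares) {
        BigInteger secret = BigInteger.ZERO;
        for (int i = 0; i < shares.length; i++) {
            BigInteger num = BigInteger.ONE;
            BigInteger den = BigInteger.ONE;
            for (int j = 0; j < shares.length; j++) {
                if (i == j) continue;
                num = num.multiply(shares[j][0].negate()).mod(P);
                den = den.multiply(shares[i][0].subtract(shares[j][0])).mod(P);
            }
            BigInteger term = shares[i][1].multiply(num).multiply(den.modInverse(P)).mod(P);
            secret = secret.add(term).mod(P);
        }
        return secret;
    }
}
```

For example, splitting a secret with k = 3 and n = 5 and then calling `recover` on any three of the five shares returns the original value.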

2.1.9 A SECURE DATA DEDUPLICATION SCHEME FOR CLOUD
STORAGE

AUTHORS: Jan Stanek, Alessandro Sorniotti, Elli Androulaki

YEAR: 2013

Nowadays, more and more corporate and private users outsource their
data to cloud storage providers. At the same time, recent data breach incidents
make end-to-end encryption an increasingly prominent requirement.
Unfortunately, such encryption renders cost-effective storage optimization
techniques, such as data deduplication, completely ineffective. In this paper,
we present a novel encryption scheme that guarantees semantic security for
unpopular data and provides weaker security and better storage and bandwidth
benefits for popular data. This way, data deduplication can be effective for
popular data, whilst semantically secure encryption protects unpopular content,
preventing its deduplication. Transitions from one mode to the other take place
seamlessly at the storage server side if and only if a file becomes popular. We
show that our scheme is secure under the Symmetric External Decisional
Diffie-Hellman Assumption in the random oracle model, and evaluate its
performance with benchmarks and simulations.

Advantage

• Reduces storage costs.

Disadvantage

• Security risks and attack scenarios from both inside and outside
adversaries.

2.2 EXISTING SYSTEM

Data deduplication is one of the important data compression techniques for
eliminating duplicate copies of repeating data, and has been widely used in
cloud storage to reduce the amount of storage space and save bandwidth.

To protect the confidentiality of sensitive data while supporting
deduplication, data is encrypted before outsourcing.

Cloud computing provides seemingly unlimited "virtualized" resources to
users as services across the whole Internet, while hiding platform and
implementation details.

Today's cloud service providers offer both highly available storage and
massively parallel computing resources at relatively low costs.

As cloud computing becomes prevalent, an increasing amount of data is
being stored in the cloud and shared by users with specified privileges, which
define the access rights of the stored data.

Disadvantage

• One critical challenge of cloud storage services is the management of the
ever-increasing volume of data.

2.3 PROPOSED SYSTEM
The convergent encryption technique has been proposed to encrypt the
data before outsourcing. To better protect data security, this paper makes the
first attempt to formally address the problem of authorized data deduplication.
Different from traditional deduplication systems, the differential
privileges of users are further considered in duplicate check besides the data
itself.
We also present several new deduplication constructions supporting
authorized duplicate check in hybrid cloud architecture.
Security analysis demonstrates that our scheme is secure in terms of the
definitions specified in the proposed security model.
As a proof of concept, we implement a prototype of our proposed
authorized duplicate check scheme and conduct testbed experiments using our
prototype.
We show that our proposed authorized duplicate check scheme incurs
minimal overhead compared to normal operations.
Advantage
• Addresses a critical challenge of cloud storage services: the management
of the ever-increasing volume of data.
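
To illustrate how user privileges can be taken into account in the duplicate check in addition to the data itself, the sketch below derives a duplicate-check token from both the file and a privilege key. This is only one possible reading under stated assumptions, not the project's actual implementation: the token is assumed to be HMAC-SHA256 of the file hash keyed by a per-privilege secret, and the class name, method names and in-memory token set are hypothetical.

```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import java.util.HashSet;
import java.util.Set;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Illustrative sketch of a privilege-aware duplicate check; names are assumptions.
class AuthorizedDuplicateCheckSketch {
    private final Set<String> storedTokens = new HashSet<>();

    // token = HMAC-SHA256(privilegeKey, SHA-256(file)), Base64-encoded.
    // Only a user who holds both the file and the privilege key can compute it.
    static String token(byte[] file, byte[] privilegeKey) {
        try {
            byte[] fileHash = MessageDigest.getInstance("SHA-256").digest(file);
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(privilegeKey, "HmacSHA256"));
            return Base64.getEncoder().encodeToString(mac.doFinal(fileHash));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns true if the same file was already stored under the same
    // privilege; otherwise records the token and returns false.
    boolean checkAndStore(String token) {
        return !storedTokens.add(token);
    }
}
```

Under this sketch, two users with the same privilege key who upload the same file produce the same token and are deduplicated, while the same file uploaded under a different privilege key produces a different token and is treated as new.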

CHAPTER – 3

SYSTEM ARCHITECTURE

Fig. 3.1 System Architecture

CHAPTER – 4

SYSTEM SPECIFICATION

4.1 Hardware Requirements

• Processor - Pentium IV
• Speed - 1.1 GHz
• RAM - 2 GB (min)
• Hard Disk - 300 GB
• Floppy Drive - 1.44 MB
• Keyboard - Standard Windows Keyboard
• Mouse - Two Button Mouse
• Monitor - SVGA or HDMI

4.2 Software Requirements

• Operating System : Windows 7
• Front End : Java JDK 1.7
• Back End : MySQL Server
• Server : Apache Tomcat Server
• Script : JSP
• Document : MS-Office 2007

CHAPTER – 5

SOFTWARE DESCRIPTION

5.1 JAVA

JAVA TECHNOLOGY:
Java technology is both a programming language and a platform.

The Java Programming Language

The Java programming language is a high-level language that can be

characterized by all of the following buzzwords:

• Simple
• Architecture neutral
• Object oriented
• Portable
• Distributed
• High performance
• Interpreted
• Multithreaded
• Robust
• Dynamic
• Secure

With most programming languages, you either compile or interpret a
program so that you can run it on your computer. The Java programming
language is unusual in that a program is both compiled and interpreted. With the
compiler, you first translate a program into an intermediate language called
Java byte codes, the platform-independent codes interpreted by the interpreter
on the Java platform. The interpreter parses and runs each Java byte code
instruction on the computer. Compilation happens just once; interpretation
occurs each time the program is executed. The following figure illustrates how
this works.

You can think of Java byte codes as the machine code instructions for the
Java Virtual Machine (Java VM). Every Java interpreter, whether it's a
development tool or a Web browser that can run applets, is an implementation
of the Java VM. Java byte codes help make "write once, run anywhere"
possible. You can compile your program into byte codes on any platform that
has a Java compiler.

The byte codes can then be run on any implementation of the Java VM.
That means that as long as a computer has a Java VM, the same program written
in the Java programming language can run on Windows 2000, a Solaris
workstation, or on an iMac.
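
As a small illustration of the compile-once, run-anywhere idea, the hypothetical class below can be compiled once with `javac PlatformInfoSketch.java` and the resulting class file run with `java PlatformInfoSketch` on any machine that has a Java VM. The class name is an assumption made for this sketch, and the printed text depends on the host system.

```java
// Illustrative sketch; the same compiled class file reports whichever
// operating system and Java VM it happens to be running on.
class PlatformInfoSketch {

    // Builds a description from standard system properties.
    static String describe() {
        return "Running on " + System.getProperty("os.name")
                + " (" + System.getProperty("os.arch") + ") with "
                + System.getProperty("java.vm.name");
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```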

5.2 THE JAVA PLATFORM
A platform is the hardware or software environment in which a program
runs. We've already mentioned some of the most popular platforms like
Windows 2000, Linux, Solaris, and MacOS. Most platforms can be described as
a combination of the operating system and hardware. The Java platform differs
from most other platforms in that it's a software-only platform that runs on top
of other hardware-based platforms.

The Java platform has two components:

• The Java Virtual Machine (Java VM)
• The Java Application Programming Interface (Java API)

You've already been introduced to the Java VM. It's the base for the Java
platform and is ported onto various hardware-based platforms.
The Java API is a large collection of ready-made software components
that provide many useful capabilities, such as graphical user interface (GUI)
widgets. The Java API is grouped into libraries of related classes and interfaces;
these libraries are known as packages. The next section, What Can Java
Technology Do?, highlights the functionality that some of the packages in the
Java API provide.
The following figure depicts a program that's running on the Java
platform. As the figure shows, the Java API and the virtual machine insulate the
program from the hardware.

Native code is code that, once compiled, runs on a specific hardware
platform. As a platform-independent environment, the Java platform can be a
bit slower than native code. However, smart compilers, well-tuned interpreters,
and just-in-time byte code compilers can bring performance close to that of
native code without threatening portability.

5.3 WHAT CAN JAVA TECHNOLOGY DO?


The most common types of programs written in the Java programming
language are applets and applications. If you've surfed the Web, you're
probably already familiar with applets. An applet is a program that adheres to
certain conventions that allow it to run within a Java-enabled browser.

However, the Java programming language is not just for writing cute,
entertaining applets for the Web. The general-purpose, high-level Java
programming language is also a powerful software platform. Using the
generous API, you can write many types of programs.

An application is a standalone program that runs directly on the Java


platform. A special kind of application known as a server serves and supports
clients on a network. Examples of servers are Web servers, proxy servers, mail
servers, and print servers. Another specialized program is a servlet.

A servlet can almost be thought of as an applet that runs on the server


side. Java Servlets are a popular choice for building interactive web
applications, replacing the use of CGI scripts. Servlets are similar to applets in
that they are runtime extensions of applications. Instead of working in browsers,
though, servlets run within Java Web servers, configuring or tailoring the
server.

How does the API support all these kinds of programs? It does so with
packages of software components that provide a wide range of functionality.

Every full implementation of the Java platform gives you the following
features:

• The essentials: Objects, strings, threads, numbers, input and output, data
structures, system properties, date and time, and so on.
• Applets: The set of conventions used by applets.
• Networking: URLs, TCP (Transmission Control Protocol), UDP (User
Datagram Protocol) sockets, and IP (Internet Protocol) addresses.
• Internationalization: Help for writing programs that can be localized for
users worldwide. Programs can automatically adapt to specific locales and be
displayed in the appropriate language.
• Security: Both low level and high level, including electronic signatures,
public and private key management, access control, and certificates.
• Software components: Known as JavaBeans™, these can plug into existing
component architectures.
• Object serialization: Allows lightweight persistence and communication
via Remote Method Invocation (RMI).
• Java Database Connectivity (JDBC™): Provides uniform access to a
wide range of relational databases.
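
As a small illustration of the object serialization feature listed above, the following sketch writes an object to a byte array and reads it back with the standard java.io object streams. The UserRecord class and its fields are hypothetical and exist only for this example; they are not part of the project code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical record type, used only for this illustration.
class UserRecord implements Serializable {
    private static final long serialVersionUID = 1L;
    final String name;
    final int fileCount;

    UserRecord(String name, int fileCount) {
        this.name = name;
        this.fileCount = fileCount;
    }
}

class SerializationSketch {
    // Serialize any Serializable object into a byte array.
    static byte[] toBytes(Serializable obj) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Rebuild the object from the bytes produced by toBytes.
    static Object fromBytes(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("class of serialized object not found", e);
        }
    }
}
```

The same byte stream could be sent over a socket or stored in a file, which is what makes serialization useful for lightweight persistence and for RMI.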
The Java platform also has APIs for 2D and 3D graphics, accessibility,
servers, collaboration, telephony, speech, animation, and more. The following
figure depicts what is included in the Java 2 SDK.

5.4 HOW WILL JAVA TECHNOLOGY CHANGE MY LIFE?
We can’t promise you fame, fortune, or even a job if you learn the Java
programming language. Still, it is likely to make your programs better and
require less effort than other languages. We believe that Java technology will
help you do the following:
• Get started quickly: Although the Java programming language is a
powerful object-oriented language, it’s easy to learn, especially for
programmers already familiar with C or C++.
• Write less code: Comparisons of program metrics (class counts, method
counts, and so on) suggest that a program written in the Java programming
language can be four times smaller than the same program in C++.
• Write better code: The Java programming language encourages good
coding practices, and its garbage collection helps you avoid memory leaks. Its
object orientation, its JavaBeans component architecture, and its wide-ranging,
easily extendible API let you reuse other people’s tested code and introduce
fewer bugs.
• Develop programs more quickly: Development may be as much as twice
as fast as writing the same program in C++. Why? You write fewer lines of
code, and Java is a simpler programming language than C++.
• Avoid platform dependencies with 100% Pure Java: You can keep
your program portable by avoiding the use of libraries written in other
languages. The 100% Pure Java™ Product Certification Program has a
repository of historical process manuals, white papers, brochures, and similar
materials online.
• Write once, run anywhere: Because 100% Pure Java programs are
compiled into machine-independent byte codes, they run consistently on any
Java platform.
• Distribute software more easily: You can upgrade applets easily from a
central server. Applets take advantage of the feature of allowing new classes to
be loaded “on the fly,” without recompiling the entire program.
5.5 ODBC:
Microsoft Open Database Connectivity (ODBC) is a standard
programming interface for application developers and database systems
providers. Before ODBC became a de facto standard for Windows programs to
interface with database systems, programmers had to use proprietary languages
for each database they wanted to connect to. Now, ODBC has made the choice
of the database system almost irrelevant from a coding perspective, which is as
it should be. Application developers have much more important things to worry
about than the syntax that is needed to port their program from one database to
another when business needs suddenly change.
Through the ODBC Administrator in Control Panel, you can specify the
particular database that is associated with a data source that an ODBC
application program is written to use. Think of an ODBC data source as a door
with a name on it. Each door will lead you to a particular database. For
example, the data source named Sales Figures might be a SQL Server database,
whereas the Accounts Payable data source could refer to an Access database.
The physical database referred to by a data source can reside anywhere on the
LAN.

The ODBC system files are not installed on your system by Windows 95.
Rather, they are installed when you set up a separate database application, such
as SQL Server Client or Visual Basic 4.0. When the ODBC icon is installed in
Control Panel, it uses a file called ODBCINST.DLL. It is also possible to
administer your ODBC data sources through a stand-alone program called
ODBCADM.EXE. There are 16-bit and 32-bit versions of this program, and
each maintains a separate list of ODBC data sources.

From a programming perspective, the beauty of ODBC is that the
application can be written to use the same set of function calls to interface with
any data source, regardless of the database vendor. The source code of the
application doesn’t change whether it talks to Oracle or SQL Server. We only
mention these two as an example. There are ODBC drivers available for several
dozen popular database systems. Even Excel spreadsheets and plain text files
can be turned into data sources. The operating system uses the Registry
information written by ODBC Administrator to determine which low-level
ODBC drivers are needed to talk to the data source (such as the interface to
Oracle or SQL Server). The loading of the ODBC drivers is transparent to the
ODBC application program. In a client/server environment, the ODBC API
even handles many of the network issues for the application programmer.
The advantages of this scheme are so numerous that you are probably
thinking there must be some catch. The only disadvantage of ODBC is that it
isn’t as efficient as talking directly to the native database interface. ODBC has
had many detractors make the charge that it is too slow. Microsoft has always
claimed that the critical factor in performance is the quality of the driver
software that is used. In our humble opinion, this is true. The availability of
good ODBC drivers has improved a great deal recently. And anyway, the
criticism about performance is somewhat analogous to those who said that
compilers would never match the speed of pure assembly language. Maybe not,
but the compiler (or ODBC) gives you the opportunity to write cleaner
programs, which means you finish sooner. Meanwhile, computers get faster
every year.
5.6 JDBC:
In an effort to set an independent database standard API for Java, Sun
Microsystems developed Java Database Connectivity, or JDBC. JDBC offers a
generic SQL database access mechanism that provides a consistent interface to
a variety of RDBMSs. This consistent interface is achieved through the use of
“plug-in” database connectivity modules, or drivers. If a database vendor
wishes to have JDBC support, the vendor must provide the driver for each
platform that the database and Java run on.
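
The plug-in driver mechanism described above can be sketched without a real database. In the example below, DemoDriver and the "jdbc:demo:" URL prefix are hypothetical; only the java.sql.Driver interface and the DriverManager calls are part of the standard API. The sketch registers the driver and then asks DriverManager which driver claims a given URL, which is how an application ends up talking to a vendor module without naming it in its own code.

```java
import java.sql.Connection;
import java.sql.Driver;
import java.sql.DriverManager;
import java.sql.DriverPropertyInfo;
import java.sql.SQLException;
import java.sql.SQLFeatureNotSupportedException;
import java.util.Properties;
import java.util.logging.Logger;

// Hypothetical driver for an imaginary "jdbc:demo:" database. It only shows
// how a vendor module plugs into DriverManager; it cannot open connections.
class DemoDriver implements Driver {
    static final String PREFIX = "jdbc:demo:";

    @Override
    public boolean acceptsURL(String url) {
        return url != null && url.startsWith(PREFIX);
    }

    @Override
    public Connection connect(String url, Properties info) throws SQLException {
        if (!acceptsURL(url)) {
            return null; // JDBC contract: null means "this URL is for another driver"
        }
        throw new SQLFeatureNotSupportedException("demo driver has no real database");
    }

    @Override
    public DriverPropertyInfo[] getPropertyInfo(String url, Properties info) {
        return new DriverPropertyInfo[0];
    }

    @Override public int getMajorVersion() { return 1; }
    @Override public int getMinorVersion() { return 0; }
    @Override public boolean jdbcCompliant() { return false; }

    @Override
    public Logger getParentLogger() throws SQLFeatureNotSupportedException {
        throw new SQLFeatureNotSupportedException("no logging in this sketch");
    }
}

class JdbcDriverSketch {
    private static final DemoDriver DRIVER = new DemoDriver();

    // Register the demo driver, then ask DriverManager which registered
    // driver accepts the URL. Returns "none" when no driver claims it.
    static String driverClassFor(String url) {
        try {
            DriverManager.registerDriver(DRIVER);
            return DriverManager.getDriver(url).getClass().getSimpleName();
        } catch (SQLException e) {
            return "none";
        }
    }
}
```

A real vendor driver would return a working Connection from connect; the application code would stay the same apart from the URL.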
To gain a wider acceptance of JDBC, Sun based JDBC’s framework on
ODBC. As you discovered earlier in this chapter, ODBC has widespread
support on a variety of platforms. Basing JDBC on ODBC will allow vendors to
bring JDBC drivers to market much faster than developing a completely new
connectivity solution.
JDBC was announced in March of 1996. It was released for a 90 day
public review that ended June 8, 1996. Because of user input, the final JDBC
v1.0 specification was released soon after.
The remainder of this section will cover enough information about JDBC
for you to know what it is about and how to use it effectively. This is by no
means a complete overview of JDBC. That would fill an entire book.
5.7 JDBC Goals:
Few software packages are designed without goals in mind. JDBC is one
whose many goals drove the development of the API. These goals,
in conjunction with early reviewer feedback, have finalized the JDBC class
library into a solid framework for building database applications in Java.

The goals that were set for JDBC are important. They will give you some
insight as to why certain classes and functionalities behave the way they do. The
seven design goals for JDBC listed here are as follows:
SQL Level API
The designers felt that their main goal was to define a SQL interface for
Java. Although not the lowest database interface level possible, it is at a low
enough level for higher-level tools and APIs to be created. Conversely, it is at a
high enough level for application programmers to use it confidently. Attaining
this goal allows for future tool vendors to “generate” JDBC code and to hide
many of JDBC’s complexities from the end user.
SQL Conformance
SQL syntax varies as you move from database vendor to database vendor.
In an effort to support a wide variety of vendors, JDBC will allow any query
statement to be passed through it to the underlying database driver. This allows
the connectivity module to handle non-standard functionality in a manner that is
suitable for its users.
JDBC must be implementable on top of common database interfaces
The JDBC SQL API must “sit” on top of other common SQL level APIs.
This goal allows JDBC to use existing ODBC level drivers by the use of a
software interface. This interface would translate JDBC calls to ODBC and vice
versa.
Provide a Java interface that is consistent with the rest of the Java system
Because of Java’s acceptance in the user community thus far, the designers
feel that they should not stray from the current design of the core Java system.

1. Keep it simple
This goal probably appears in all software design goal listings. JDBC is
no exception. Sun felt that the design of JDBC should be very simple, allowing
for only one method of completing a task per mechanism. Allowing duplicate
functionality only serves to confuse the users of the API.

2. Use strong, static typing wherever possible

Strong typing allows for more error checking to be done at compile time;
as a result, fewer errors appear at runtime.

3. Keep the common cases simple

Because more often than not, the usual SQL calls used by the
programmer are simple SELECTs, INSERTs, DELETEs and UPDATEs,
these queries should be simple to perform with JDBC. However, more complex
SQL statements should also be possible.
Finally, we decided to proceed with the implementation using Java
networking, and to use an MS Access database for dynamically updating the
cache table.

Java is two things: a programming language and a platform.

Java is a high-level programming language that is all of the following:

• Simple
• Architecture-neutral
• Object-oriented
• Portable
• Distributed
• High-performance
• Interpreted
• Multithreaded
• Robust
• Dynamic
• Secure

Java is also unusual in that each Java program is both compiled and
interpreted. With a compiler, you translate a Java program into an intermediate
language called Java byte codes, the platform-independent codes interpreted by
the Java interpreter. With an interpreter, each Java byte code instruction is
parsed and run on the computer.

Compilation happens just once; interpretation occurs each time the
program is executed. The figure illustrates how this works.

[Figure labels: Java Program, Compilers, Interpreter, My Program]

5.8 NETWORKING TCP/IP STACK:


The TCP/IP stack is shorter than the OSI one:

TCP is a connection-oriented protocol; UDP (User Datagram Protocol) is
a connectionless protocol.

IP datagrams:
The IP layer provides a connectionless and unreliable delivery system. It
considers each datagram independently of the others. Any association between
datagrams must be supplied by the higher layers. The IP layer supplies a
checksum that includes its own header. The header includes the source and
destination addresses. The IP layer handles routing through an Internet. It is also
responsible for breaking up large datagrams into smaller ones for transmission
and reassembling them at the other end.
UDP:
UDP is also connectionless and unreliable. What it adds to IP is a
checksum for the contents of the datagram and port numbers. These are used to
give a client/server model - see later.
TCP:
TCP supplies logic to give a reliable connection-oriented protocol above
IP. It provides a virtual circuit that two processes can use to communicate.
Internet addresses
In order to use a service, you must be able to find it. The Internet uses an
address scheme for machines so that they can be located. The address is a 32 bit
integer which gives the IP address.
Network address:
Class A uses 8 bits for the network address with 24 bits left over for other
addressing. Class B uses 16 bit network addressing. Class C uses 24 bit network
addressing and class D uses all 32.
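
The class of an address is determined by the leading bits of its first octet. The sketch below, with hypothetical method names, classifies a first octet and reports the number of network bits given in the text; class E (leading bits 1111, reserved) is added only so that every octet value has a result.

```java
class AddressClassSketch {
    // Classify an IPv4 address by the leading bits of its first octet (0..255).
    static char addressClass(int firstOctet) {
        if (firstOctet < 0 || firstOctet > 255) {
            throw new IllegalArgumentException("octet out of range: " + firstOctet);
        }
        if ((firstOctet & 0x80) == 0x00) return 'A'; // 0xxxxxxx
        if ((firstOctet & 0xC0) == 0x80) return 'B'; // 10xxxxxx
        if ((firstOctet & 0xE0) == 0xC0) return 'C'; // 110xxxxx
        if ((firstOctet & 0xF0) == 0xE0) return 'D'; // 1110xxxx
        return 'E';                                  // 1111xxxx, reserved
    }

    // Network bits per class as stated in the text; -1 for anything else.
    static int networkBits(char addressClass) {
        switch (addressClass) {
            case 'A': return 8;
            case 'B': return 16;
            case 'C': return 24;
            case 'D': return 32;
            default:  return -1;
        }
    }
}
```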
Subnet address:
Internally, the UNIX network is divided into sub networks. Building 11 is
currently on one sub network and uses 10-bit addressing, allowing 1024
different hosts.

Host address:
8 bits are finally used for host addresses within our subnet. This places a
limit of 256 machines that can be on the subnet.
Total address:

The 32 bit address is usually written as 4 integers separated by dots.
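
The relation between the 32-bit integer and the dotted notation can be shown with a short sketch. The class and method names are illustrative only; each of the four integers is one 8-bit slice of the address, most significant first.

```java
class DottedQuadSketch {
    // Format a 32-bit address as four decimal octets separated by dots.
    static String toDotted(int address) {
        return ((address >>> 24) & 0xFF) + "."
             + ((address >>> 16) & 0xFF) + "."
             + ((address >>> 8) & 0xFF) + "."
             + (address & 0xFF);
    }

    // Parse dotted notation back into the 32-bit integer.
    static int fromDotted(String text) {
        String[] parts = text.split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException("expected four octets: " + text);
        }
        int value = 0;
        for (String part : parts) {
            int octet = Integer.parseInt(part);
            if (octet < 0 || octet > 255) {
                throw new IllegalArgumentException("octet out of range: " + part);
            }
            value = (value << 8) | octet; // shift earlier octets up, append this one
        }
        return value;
    }
}
```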


Port addresses
A service exists on a host, and is identified by its port. This is a 16 bit
number. To send a message to a server, you send it to the port for that service of
the host that it is running on. This is not location transparency! Certain of these
ports are "well known".
Sockets:
A socket is a data structure maintained by the system to handle network
connections. A socket is created using the call socket. It returns an integer that
is like a file descriptor. In fact, under Windows, this handle can be used with
the ReadFile and WriteFile functions.
#include <sys/types.h>
#include <sys/socket.h>
int socket(int family, int type, int protocol);
Here "family" will be AF_INET for IP communications, protocol will be
zero, and type will depend on whether TCP or UDP is used. Two processes
wishing to communicate over a network create a socket each. These are similar
to two ends of a pipe - but the actual pipe does not yet exist.
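
Since the project itself is written in Java, the same idea can be sketched with java.net.ServerSocket and java.net.Socket, which wrap the socket calls described above. The class below is a minimal illustration, not the project's networking code: it starts a one-shot echo server on a free loopback port, connects a client over TCP, sends one line, and returns the reply.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

class TcpEchoSketch {
    // Send one line to a throwaway echo server and return what comes back.
    static String echoOnce(String message) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 asks the system for any free port.
        try (ServerSocket server = new ServerSocket(0, 1, loopback)) {
            Thread serverThread = new Thread(() -> serveOne(server));
            serverThread.start();
            try (Socket client = new Socket(loopback, server.getLocalPort());
                 PrintWriter out = writer(client);
                 BufferedReader in = reader(client)) {
                out.println(message);        // client end of the "pipe"
                String reply = in.readLine();
                serverThread.join();
                return reply;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Accept a single connection and echo back the first line received.
    private static void serveOne(ServerSocket server) {
        try (Socket peer = server.accept();
             BufferedReader in = reader(peer);
             PrintWriter out = writer(peer)) {
            out.println(in.readLine());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static BufferedReader reader(Socket s) throws IOException {
        return new BufferedReader(new InputStreamReader(s.getInputStream(), StandardCharsets.UTF_8));
    }

    private static PrintWriter writer(Socket s) throws IOException {
        // Second argument enables auto-flush on println.
        return new PrintWriter(new OutputStreamWriter(s.getOutputStream(), StandardCharsets.UTF_8), true);
    }
}
```

The connection only exists once the server has accepted the client, which corresponds to the pipe being created between the two sockets.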

5.9 JFREE CHART:
JFreeChart is a free 100% Java chart library that makes it easy for
developers to display professional quality charts in their applications.
JFreeChart's extensive feature set includes:
A consistent and well-documented API, supporting a wide range of chart
types;
A flexible design that is easy to extend, and targets both server-side and
client-side applications;
Support for many output types, including Swing components, image files
(including PNG and JPEG), and vector graphics file formats (including PDF,
EPS and SVG);
JFreeChart is "open source" or, more specifically, free software. It is
distributed under the terms of the GNU Lesser General Public Licence (LGPL),
which permits use in proprietary applications.
5.9.1. Map Visualizations:
Charts showing values that relate to geographical areas. Some examples
include: (a) population density in each state of the United States, (b) income per
capita for each country in Europe, (c) life expectancy in each country of the
world. The tasks in this project include: Sourcing freely redistributable vector
outlines for the countries of the world, states/provinces in particular countries
(USA in particular, but also other areas);
Creating an appropriate dataset interface (plus default implementation), a
renderer, and integrating this with the existing XYPlot class in JFreeChart;
Testing, documenting, testing some more, documenting some more.

5.9.2. Time Series Chart Interactivity
Implement a new (to JFreeChart) feature for interactive time series charts
--- to display a separate control that shows a small version of ALL the time
series data, with a sliding "view" rectangle that allows you to select the subset
of the time series data to display in the main chart.
5.9.3. Dashboards
There is currently a lot of interest in dashboard displays. Create a flexible
dashboard mechanism that supports a subset of JFreeChart chart types (dials,
pies, thermometers, bars, and lines/time series) that can be delivered easily via
both Java Web Start and an applet.
5.9.4. Property Editors
The property editor mechanism in JFreeChart only handles a small subset
of the properties that can be set for charts. Extend (or reimplement) this
mechanism to provide greater end-user control over the appearance of the
charts.

CHAPTER – 6

SYSTEM / SUBSYSTEM SPECIFICATION

6.1 MODULES
• Registration
• File upload
• Checking for duplication
• Naïve Bayes classifier
• AES Encryption
6.1.1 Registration:
• To register a user with identity ID, the group manager randomly selects
a number.
• The group manager then adds the user to the group user list, which will
be used in the traceability phase.
• After registration, the user obtains a private key, which will be used for
group signature generation and file decryption.

[Figure: Registration, between Group Manager and Group Members]

6.1.2 File Upload:


• The data is uploaded.
• The data property is especially useful when we expect the delegation to
be efficient and flexible.
6.1.3 Checking for duplication:
• When a file is uploaded to the cloud storage for distribution, a data
comparison process is performed during the upload.
• It uses the MD5 hash value of the file content to check whether the file
is already present, in order to avoid deduplication attacks.
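
The hash comparison can be sketched with the standard java.security.MessageDigest class. The class and method names below are illustrative and are not taken from the project source; the sketch only shows how the MD5 value of a file's bytes is obtained as a hexadecimal string so that two uploads can be compared.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class Md5Sketch {
    // Return the MD5 digest of the content as 32 lowercase hex characters.
    static String md5Hex(byte[] content) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(content);
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException("MD5 not available", e);
        }
    }
}
```

Two files with identical content always produce the same value, so equal digests are treated as a duplicate. MD5 is no longer considered collision resistant, so the same code could use "SHA-256" as the algorithm name if stronger guarantees are needed.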

6.1.4 Naive Bayes Classifier:
• Binary classifiers are generated for each class of event using relevant
features for the class and classification algorithm.
• Binary classifiers are derived from the training sample by considering all
classes other than the current class as other.
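
Deriving the training labels for one binary classifier can be sketched as follows. The class name and the example labels used with it are hypothetical; the only behaviour taken from the text is that every class other than the current one is relabelled as "other".

```java
import java.util.ArrayList;
import java.util.List;

class OneVsRestSketch {
    // Relabel a multi-class training sample for one binary classifier:
    // the current class keeps its name, every other class becomes "other".
    static List<String> relabel(List<String> labels, String currentClass) {
        List<String> binary = new ArrayList<>(labels.size());
        for (String label : labels) {
            binary.add(label.equals(currentClass) ? currentClass : "other");
        }
        return binary;
    }
}
```

Calling relabel once per class yields one binary training set per class, each of which can then be passed to the classification algorithm.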

6.1.5 AES Encryption:


• AES encryption is applied to protect the client’s privacy.
• We apply the anonymous AES in branching programs.
• Decryption is outsourced to reduce the decryption complexity due to the
use of AES.
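
A minimal sketch of encrypting file content with AES using the standard javax.crypto API is given below. The report does not state the key size or cipher mode, so the 128-bit key and the AES/GCM/NoPadding transformation are assumptions made for this example, as are the class and method names.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

class AesSketch {
    private static final int IV_BYTES = 12;  // nonce length commonly used with GCM (assumption)
    private static final int TAG_BITS = 128; // authentication tag length (assumption)

    // Generate a fresh 128-bit AES key.
    static SecretKey newKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Encrypt and return IV followed by ciphertext (which includes the tag).
    static byte[] encrypt(SecretKey key, byte[] plain) {
        try {
            byte[] iv = new byte[IV_BYTES];
            new SecureRandom().nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] body = cipher.doFinal(plain);
            byte[] out = new byte[iv.length + body.length];
            System.arraycopy(iv, 0, out, 0, iv.length);
            System.arraycopy(body, 0, out, iv.length, body.length);
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Reverse of encrypt: read the IV from the front, then decrypt the rest.
    static byte[] decrypt(SecretKey key, byte[] message) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key,
                    new GCMParameterSpec(TAG_BITS, message, 0, IV_BYTES));
            return cipher.doFinal(message, IV_BYTES, message.length - IV_BYTES);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because a random IV is used, encrypting the same file twice gives different ciphertexts; the duplicate check therefore has to be made on the plaintext hash before encryption.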

6.2 SYSTEM DESCRIPTION


6.2.1 REGISTRATION:
To register a user with identity ID, the group manager randomly
selects a number. The group manager then adds the user to the group user list,
which will be used in the traceability phase. After registration, the user obtains
a private key, which will be used for group signature generation and file
uploading.
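
The registration steps can be sketched as below. This is only an outline under stated assumptions: the report does not specify the group signature scheme, so the 128-bit random number, the in-memory user list and the RSA key pair are placeholders, and all names are hypothetical. In an actual group signature scheme the private key would be derived from the manager's master key and the selected number.

```java
import java.math.BigInteger;
import java.security.KeyPairGenerator;
import java.security.NoSuchAlgorithmException;
import java.security.PrivateKey;
import java.security.SecureRandom;
import java.util.LinkedHashMap;
import java.util.Map;

class GroupManagerSketch {
    private final SecureRandom random = new SecureRandom();
    // Group user list: identity -> number selected at registration,
    // kept so that a user can be traced later.
    private final Map<String, BigInteger> userList = new LinkedHashMap<>();

    // Register a user and hand back the private key the user will hold.
    PrivateKey register(String userId) {
        if (userList.containsKey(userId)) {
            throw new IllegalArgumentException("already registered: " + userId);
        }
        BigInteger selected = new BigInteger(128, random); // randomly selected number
        userList.put(userId, selected);
        try {
            // Placeholder key generation (assumption, see lead-in).
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048, random);
            return generator.generateKeyPair().getPrivate();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    boolean isRegistered(String userId) {
        return userList.containsKey(userId);
    }
}
```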
[Figure: Registration, between Group Manager and Group Members]

6.2.1.1 FILE UPLOAD:


The canonical application is data uploading. The data property is
especially useful when we expect the delegation to be efficient and flexible. The
schemes enable a content provider to share her data in a confidential and
selective way, with a fixed and small cipher text expansion, by distributing to
each authorized user.
When a file is uploaded to the cloud storage for distribution, a data
comparison process is performed during the upload. It uses the MD5 hash value
of the file content to check whether the file is already present, in order to avoid
deduplication attacks.
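
The upload-time comparison can be sketched as an index of content hashes. The DedupIndexSketch class below is hypothetical and keeps the index in memory; in the project the index would live with the cloud storage. An upload is accepted only when the MD5 value of its content has not been seen before.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;
import java.util.HashSet;
import java.util.Set;

class DedupIndexSketch {
    // MD5 fingerprints of every file stored so far.
    private final Set<String> knownHashes = new HashSet<>();

    // Returns true if the content is new and has been recorded,
    // false if a file with the same content is already present.
    boolean upload(byte[] content) {
        return knownHashes.add(fingerprint(content));
    }

    int storedCount() {
        return knownHashes.size();
    }

    private static String fingerprint(byte[] content) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(content);
            return Base64.getEncoder().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }
}
```

In client-side deduplication the client would compute the fingerprint and send only that value first, uploading the file body only when the server reports that the hash is unknown.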

6.2.1.2 NAÏVE BAYES CLASSIFIER:
Binary classifiers are generated for each class of event using relevant
features for the class and classification algorithm. Binary classifiers are derived
from the training sample by considering all classes other than the current class
as other, e.g., Normal will consider two classes: normal and other. The purpose
of this phase is to select different features for different classes by applying the
information gain or gain ratio in order to identify relevant features for each
binary classifier.
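
The information gain used for feature selection can be illustrated for the simplest case of a binary feature and a binary (class versus other) label. It is the entropy of the labels minus the weighted entropy of the labels within each feature value. The class below is a sketch with hypothetical names; counts[f][c] is assumed to hold the number of training samples with feature value f and label c.

```java
class InformationGainSketch {
    // Shannon entropy, in bits, of a two-class sample with the given counts.
    static double entropy(int positives, int negatives) {
        int total = positives + negatives;
        if (total == 0) {
            return 0.0;
        }
        return term(positives, total) + term(negatives, total);
    }

    // One term -p * log2(p) of the entropy sum; zero counts contribute nothing.
    private static double term(int count, int total) {
        if (count == 0) {
            return 0.0;
        }
        double p = (double) count / total;
        return -p * Math.log(p) / Math.log(2);
    }

    // Information gain of a feature.
    // counts[f][0] = samples with feature value f labelled "other",
    // counts[f][1] = samples with feature value f labelled with the class.
    static double informationGain(int[][] counts) {
        int total = 0;
        int positives = 0;
        for (int[] row : counts) {
            total += row[0] + row[1];
            positives += row[1];
        }
        double before = entropy(positives, total - positives);
        double after = 0.0;
        for (int[] row : counts) {
            int size = row[0] + row[1];
            if (size > 0) {
                after += ((double) size / total) * entropy(row[1], row[0]);
            }
        }
        return before - after;
    }
}
```

A feature that separates the class from the rest perfectly has a gain equal to the label entropy, while a feature that is independent of the label has a gain of zero; features are ranked by this value for each binary classifier.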
6.2.1.3 AES ENCRYPTION:
To protect the client’s privacy, we apply the anonymous AES in
branching programs. To reduce the decryption complexity due to the use of
AES, we apply recently proposed decryption outsourcing with privacy
protection to shift the client’s pairing computation to the cloud server. The
adversary launches Key Generate algorithms to query for as many private keys
as he wants, which correspond to attribute sets A1, ..., Aq disjointly in the
charge of all authorities {Ak}, but none of these keys satisfy T0. Besides, he
also conducts arbitrarily many computations using the public and secret keys
that he has (belonging to compromised authorities).

CHAPTER – 7

SYSTEM DESIGN

7.1 DATA FLOW DIAGRAM


A data flow diagram (DFD) is a graphical representation of the "flow" of
data through an information system, modeling its process aspects. A DFD is
often used as a preliminary step to create an overview of the system without
going into great detail, which can later be elaborated. DFDs can also be used for
the visualization of data processing (structured design). A DFD shows what
kind of information will be input to and output from the system, how the data
will advance through the system, and where the data will be stored. It does not
show information about the timing of processes or about whether
processes will operate in sequence or in parallel, unlike a flowchart, which also
shows this information.

DATA FLOW DIAGRAM:

USER:
[Figure: user data flow. Nodes: User Login, Check (Yes/No), User Registration, Insert File, Shared, Find Duplicates, End Process]

ADMIN:
[Figure: admin data flow. Nodes: Admin Login, Check (Yes/No), Registration, Send Files, Request Key, End Process]

7.2 USECASE DIAGRAM


A use case is a list of steps, typically defining interactions between a role
(known in Unified Modelling Language (UML) as an "actor") and a system, to
achieve a goal. The actor can be a human, an external system, or time.
In systems engineering, use cases are used at a higher level than within
software engineering, often representing missions or stakeholder goals. The
detailed requirements may then be captured in Systems Modelling
Language (SysML) or as contractual statements.

USECASE DIAGRAM:

Fig 7.2 Usecase Diagram

CLASS DIAGRAM:
A class diagram is an illustration of the relationships and source code
dependencies among classes in the Unified Modeling Language (UML).
In this context, a class defines the methods and variables in an object,
which is a specific entity in a program or the unit of code representing that
entity.

CLASS DIAGRAM:

[Figure labels: Data Storing Center, Data User 1, Data User 2, Sending Data, Receiving Data]

Fig 7.2. Class Diagram

7.3 SEQUENCE DIAGRAM
A sequence diagram is an interaction diagram that shows how processes
operate with one another and in what order. It is a construct of a Message
Sequence Chart. A sequence diagram shows object interactions arranged in time
sequence. It depicts the objects and classes involved in the scenario and the
sequence of messages exchanged between the objects needed to carry out the
functionality of the scenario. Sequence diagrams are typically associated with
use case realizations in the Logical View of the system under development.
Sequence diagrams are sometimes called event diagrams or event scenarios.
Sequence diagrams can be used to:
• Document how your system will behave in various scenarios
• Validate the model logic of complex operations and functions
A sequence diagram shows, as parallel vertical lines, different processes or
objects that live simultaneously, and, as horizontal arrows, the messages
exchanged between them, in the order in which they occur. This allows the
specification of simple runtime scenarios in a graphical manner.
SEQUENCE DIAGRAM:

SENDER USER:
[Figure labels: user1, Login (enter username and password), Send Message, View Request, Generate Key]

RECEIVER USER:
[Figure labels: User2, Login (enter User2 name and password), View Encrypted Message, Send Key Request, Logout]

Fig 7.3 Sequence Diagram

7.4 ACTIVITY DIAGRAM


Activity diagrams are graphical representations of workflows of stepwise
activities and actions with support for choice, iteration and concurrency. In
the Unified Modeling Language, activity diagrams are intended to model both
computational and organizational processes. Activity diagrams show the overall
flow of control.

SENDER LOGIN:
[Figure: User login, Send Message, View Request, Send Key, Logout]

RECEIVER LOGIN:
[Figure: User2 login, Receive Message, View Message, Send Key Request, Logout]

CHAPTER – 8

SYSTEM TESTING

8.1 SYSTEM TESTING:


Testing is a process of checking whether the developed system is working
according to the original objectives and requirements. It is a set of activities that
can be planned in advance and conducted systematically. Testing is vital to the
success of the system. System testing makes the logical assumption that if all the
parts of the system are correct, the overall goal will be successfully achieved.
Inadequate testing, or no testing, leads to errors that may not appear until many
months later.
This creates two problems: the time lag between the cause and the
appearance of the problem, and the effect of the system errors on the files and
records within the system. A small system error can conceivably explode into a
much larger problem. Effective testing early in the process translates directly
into long term cost savings from a reduced number of errors. Another reason for
system testing is its utility as a user-oriented vehicle before implementation.
The best program is worthless if it does not produce the correct outputs.

Description | Expected result
Test for application window properties. | All the properties of the windows are to be properly aligned and displayed.
Test for mouse operations. | All the mouse operations like click, drag, etc. must perform the necessary operations without any exceptions.

UNIT TESTING:
A program represents the logical elements of a system. For a program to
run satisfactorily, it must compile and test data correctly and tie in properly with
other programs. Achieving an error free program is the responsibility of the
programmer. Program testing checks for two types of errors: syntax and
logical. A syntax error is a program statement that violates one or more rules of
the language in which it is written. An improperly defined field dimension or
omitted keywords are common syntax errors. These errors are shown through
error messages generated by the computer. For logic errors, the programmer
must examine the output carefully.

UNIT TESTING

Description | Expected result
Test for application window properties. | All the properties of the windows are to be properly aligned and displayed.
Test for mouse operations. | All the mouse operations like click, drag, etc. must perform the necessary operations without any exceptions.

FUNCTIONAL TESTING:
Functional testing of an application is used to prove the application
delivers correct results, using enough inputs to give an adequate level of
confidence that it will work correctly for all sets of inputs. The functional testing
will need to prove that the application works for each client type and that
personalization functions work correctly. When a program is tested, the actual
output is compared with the expected output. When there is a discrepancy, the
sequence of instructions must be traced to determine the problem. The process
is facilitated by breaking the program into self-contained portions, each of
which can be checked at certain key points. The idea is to compare program
values against desk-calculated values to isolate the problems.

FUNCTIONAL TESTING:

Description | Expected result
Test for all modules. | All peers should communicate in the group.
Test for various peers in a distributed network framework as it displays all users available in the group. | The result after execution should give the accurate result.

NON-FUNCTIONAL TESTING:
Non-functional software testing encompasses a rich spectrum of
testing strategies, describing the expected results for every test case. It uses
symbolic analysis techniques. This testing is used to check that an application will
work in the operational environment.

Non-functional testing includes:

 Load testing
 Performance testing
 Usability testing
 Reliability testing
 Security testing

LOAD TESTING:
An important tool for implementing system tests is a Load generator. A
Load generator is essential for testing quality requirements such as performance
and stress. A load can be a real load, that is, the system can be put under test to
real usage by having actual telephone users connected to it. They will generate
test input data for system test.

LOAD TESTING

Description | Expected result
It is necessary to ascertain that the application behaves correctly under loads when a ‘Server busy’ response is received. | Should designate another active node as a Server.

8.2 PERFORMANCE TESTING:


Performance tests are utilized in order to determine the widely defined
performance of the software system such as execution time associated with
various parts of the code, response time and device utilization. The intent of this
testing is to identify weak points of the software system and quantify its
shortcomings.

PERFORMANCE TESTING:

Description | Expected result
This is required to assure that an application performs adequately, having the capability to handle many peers, delivering its results in expected time and using an acceptable level of resource; it is an aspect of operational management. | Should handle large input values, and produce accurate results in the expected time.

RELIABILITY TESTING:
Software reliability is the ability of a system or component to perform
its required functions under stated conditions for a specified period of time, and
it is ensured in this testing. Reliability can be expressed as the ability of
the software to reveal defects under testing conditions, according to the
specified requirements. It is the probability that a software system will operate
without failure under given conditions for a given time interval, and it focuses
on the behavior of the software element. It forms a part of software quality
control.

RELIABILITY TESTING:

Description | Expected result
This is to check that the server is rugged and reliable and can handle the failure of any of the components involved in providing the application. | In case of failure of the server an alternate server should take over the job.

8.3 SECURITY TESTING:


Security testing evaluates system characteristics that relate to the
availability, integrity and confidentiality of the system data and services.
Users/Clients should be encouraged to make sure their security needs are very
clearly known at requirements time, so that the security issues can be addressed
by the designers and testers.

SECURITY TESTING

Description | Expected result
Checking that the user identification is authenticated. | In case of failure it should not be connected in the framework.
Check whether group keys in a tree are shared by all peers. | The peers should know the group key in the same group.

8.4 WHITE BOX TESTING:


White box testing, sometimes called glass-box testing, is a test case
design method that uses the control structure of the procedural design to
derive test cases. White box testing focuses on the inner structure of
the software to be tested.

WHITE BOX TESTING:

Description | Expected result
Exercise all logical decisions on their true and false sides. | All the logical decisions must be valid.
Execute all loops at their boundaries and within their operational bounds. | All the loops must be finite.
Exercise internal data structures to ensure their validity. | All the data structures must be valid.
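
The white box cases in the table can be made concrete with a small sketch. The countAbove method is a hypothetical unit under test, not part of the project; the checks are chosen from its control structure so that the decision is taken on both its true and false sides and the loop runs zero, one and many times.

```java
class WhiteBoxSketch {
    // Hypothetical unit under test: counts how many values exceed a limit.
    static int countAbove(int[] values, int limit) {
        int count = 0;
        for (int value : values) {      // loop: zero, one or many iterations
            if (value > limit) {        // decision: true side and false side
                count++;
            }
        }
        return count;
    }

    // Test cases derived from the control structure above.
    static boolean allCasesPass() {
        boolean emptyLoop   = countAbove(new int[0], 5) == 0;            // loop skipped
        boolean trueBranch  = countAbove(new int[] {6}, 5) == 1;         // one pass, decision true
        boolean falseBranch = countAbove(new int[] {5}, 5) == 0;         // one pass, decision false (boundary value)
        boolean manyPasses  = countAbove(new int[] {1, 9, 5, 7}, 5) == 2; // several passes, both sides
        return emptyLoop && trueBranch && falseBranch && manyPasses;
    }
}
```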

8.5 BLACK BOX TESTING:
Black box testing, also called behavioral testing, focuses on the
functional requirements of the software.
That is, black box testing enables the software engineer to derive sets of
input conditions that will fully exercise all functional requirements for a
program. Black box testing is not an alternative to white box techniques.
Rather, it is a complementary approach that is likely to uncover a
different class of errors than white box methods.
Black box testing attempts to find errors by focusing on the inputs,
outputs, and principal functions of a software module.
The starting point of black box testing is either a specification or
code. The contents of the box are hidden, and the stimulated software should
produce the desired results.
BLACK BOX TESTING:

Description | Expected result
To check for incorrect or missing functions. | All the functions must be valid.
To check for interface errors. | The entire interface must function normally.
To check for errors in data structures or external database access. | The database update and retrieval must be done.
To check for initialization and termination errors. | All the functions and data structures must be initialized properly and terminated normally.

All the above system testing strategies are carried out, as the
development, documentation and institutionalization of the proposed goals and
related policies are essential.

CHAPTER – 9

SYSTEM IMPLEMENTATION

9.1 USER TRAINING


Implementation of software refers to the final installation of the package
in its real environment, to the satisfaction of the intended users and the operation
of the systems. Users may not be sure that the software is meant to make their
job easier, so:
• The active user must be made aware of the benefits of using the system.
• Their confidence in the software must be built up.
• Proper guidance must be imparted to the user so that he is comfortable in using
the application.
Before going ahead and viewing the system, the user must know that the
server program must be running on the server for the results to be viewed. If the
server object is not running on the server, the actual processes will not take
place.

9.2 TRAINING ON THE APPLICATION SOFTWARE


To achieve the objectives and benefits expected from the proposed system, it
is essential for the people who will be involved to be confident of their role in the
new system. As systems become more complex, the need for education and
training becomes more and more important.
Education is complementary to training. It brings life to formal training by
explaining the background of the resources to the users.
Education involves creating the right atmosphere and motivating user staff.
Educational information can make training more interesting and more
understandable.
9.3 OPERATIONAL DOCUMENTATION
After the necessary basic training in computer awareness has been provided,
the users will have to be trained on the new application software. This will convey
the underlying philosophy of the use of the new system, such as the screen flow,
screen design, type of help on the screen, type of errors while entering the data,
the corresponding validation check at each entry and the ways to correct the data
entered. This training may differ across user groups and across
levels of the hierarchy.
Operational maintenance is the care and minor maintenance of equipment
using procedures that do not require detailed technical knowledge of the
equipment's or system's function and design. This category of operational
maintenance normally consists of inspecting, cleaning, servicing, preserving,
lubricating, and adjusting, as required. Such maintenance may also include
minor parts replacement that does not require the person performing the work to
have highly technical skills or to perform internal alignment.

9.4 SYSTEM MAINTENANCE


Once the implementation plan is decided, it is essential that the user of the
system is made familiar and comfortable with the environment. Documentation
covering all operations of the system is being developed.
Useful tips and guidance are given to the user inside the application itself.
The system is designed to be user friendly so that the user can operate it from
the tips given in the application itself.
The results obtained from the evaluation process help the organization to
determine whether its information systems are effective and efficient or
otherwise. The process of monitoring, evaluating, and modifying existing
information systems to make required or desirable improvements may be
termed system maintenance. System maintenance is an ongoing activity,
which covers a wide variety of activities, including removing program and
design errors, updating documentation and test data, and updating user support.

9.5 CORRECTIVE MAINTENANCE
The maintenance phase of the software cycle is the time in which
software performs useful work. After a system is successfully implemented, it
should be maintained in a proper manner. System maintenance is an important
aspect of the software development life cycle.
System maintenance is needed to make the system adaptable to changes in
its environment. There may be social, technical and other environmental
changes that affect a system which is being implemented.
Software product enhancements may involve providing new functional
capabilities, improving user displays and modes of interaction, and upgrading the
performance characteristics of the system.
Only through proper system maintenance procedures can the system be adapted to
cope with these changes. Software maintenance is, of course, far more than
"finding mistakes".
Corrective maintenance implies removing errors in a program, which
might have crept into the system due to faulty design or wrong assumptions.
Thus, in corrective maintenance, processing or performance failures are
repaired.

9.6 ADAPTIVE MAINTENANCE


The first maintenance activity occurs because it is unreasonable to assume
that software testing will uncover all latent errors in a large software system.
During the use of any large program, errors will occur and be reported to the
developer.
The process that includes the diagnosis and correction of one or more
errors is called corrective maintenance.
Adaptive maintenance includes changes to the functionality of the system
developed for specific customer needs. Adaptive maintenance also implies the
need to modify certain functionalities even though the system works as
expected and, in this sense, contains no fault or error. It usually
occurs when there is a change in legal norms or a shift in the policies of
business users.
Such changes usually cause the system to diverge from its originally set
parameters, and therefore require harmonization and the implementation of
new functionalities based on user requests.
9.7 PERCEPTIVE MAINTENANCE
The second activity that contributes to a definition of maintenance occurs
because of the rapid change that is encountered in every aspect of computing.
Therefore, adaptive maintenance, an activity that modifies software to
properly interface with a changing environment, is both necessary and
commonplace.
Perceptive Process Design is a user-friendly business process modelling
environment that allows business users to diagram, model, display, document
and publish business process maps and to meet process-specific compliance
requirements.
Perceptive Process Enterprise is a case-based business process management
tool that supports complex case-handling and work process execution and
automation for a variety of industries and organizations.

9.8 PREVENTIVE MAINTENANCE


The third activity that may be applied to a definition of maintenance
occurs when a software package is successful. As the software is used,
recommendations for new capabilities, modifications to existing functions, and
general enhancement are received from users. To satisfy requests in this category,
perceptive maintenance is performed. This activity accounts for the majority of
all efforts expended on software maintenance.
Preventive maintenance is the care and servicing by personnel for the
purpose of maintaining equipment in satisfactory operating condition by
providing for systematic inspection, detection, and correction of incipient
failures either before they occur or before they develop into major defects. It
is the work carried out on equipment in order to avoid its breakdown or
malfunction.
It is a regular and routine action taken on equipment in order to prevent its
breakdown. It includes tests, measurements, adjustments, parts
replacement, and cleaning, performed specifically to prevent faults from
occurring.
Preventive (sometimes called preventative) maintenance is regularly
performed maintenance on a piece of equipment to reduce the likelihood of
failure. Preventive maintenance ensures that anything of value to your
organization receives consistent maintenance to avoid unexpected breakdowns
and costly disruptions.
In the same way you would not wait until your car's engine fails to get
the oil changed, machines, equipment, buildings and anything of value to your
organization need consistent maintenance to avoid breakdowns and costly
disruptions.
This work is called Planned or Preventive Maintenance (PM). Preventive
Maintenance is performed while the equipment is operating normally to avoid
the consequences of unexpected breakdowns, such as increased costs, downtime
and more.
PM is a strategy that all companies can implement to move away from
reactive maintenance modes, and to begin a reliability journey. As the best
programs include a combination of maintenance approaches, implementing
preventive maintenance is an important step to the ideal strategy of predictive
maintenance.

CHAPTER – 10

CONCLUSION

The growing need for secure cloud storage services and the attractive
properties of convergent cryptography led us to combine them, thus
defining an innovative solution to the data outsourcing security and efficiency
issues. Our solution is based on symmetric encryption of metadata files,
owing to the high sensitivity of this information to intrusions.
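The idea behind convergent cryptography can be sketched as follows: the encryption key is derived from the data itself, so identical files always produce identical ciphertext, which is what allows a server to deduplicate encrypted data. This is a minimal illustration under stated assumptions, not the project's implementation: the class name, the use of SHA-256 truncated to a 128-bit AES key, and AES in CTR mode with a fixed all-zero IV are all choices made for this example.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Illustrative sketch of convergent encryption; all names are hypothetical.
final class ConvergentEncryptionSketch {

    // Derives the key from the content: first 16 bytes of SHA-256(plaintext).
    static byte[] deriveKey(byte[] plaintext) {
        try {
            byte[] hash = MessageDigest.getInstance("SHA-256").digest(plaintext);
            return Arrays.copyOf(hash, 16);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Encrypts the plaintext under its own content-derived key.
    static byte[] encrypt(byte[] plaintext) {
        return apply(Cipher.ENCRYPT_MODE, deriveKey(plaintext), plaintext);
    }

    // Decrypts with a key the data owner has kept.
    static byte[] decrypt(byte[] key, byte[] ciphertext) {
        return apply(Cipher.DECRYPT_MODE, key, ciphertext);
    }

    // A fixed IV is tolerable here only because each key encrypts exactly
    // one plaintext (the one it was derived from). Assumption of this sketch.
    private static byte[] apply(int mode, byte[] key, byte[] input) {
        try {
            Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
            cipher.init(mode, new SecretKeySpec(key, "AES"),
                    new IvParameterSpec(new byte[16]));
            return cipher.doFinal(input);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] fileOfUserA = "quarterly report".getBytes(StandardCharsets.UTF_8);
        byte[] fileOfUserB = "quarterly report".getBytes(StandardCharsets.UTF_8);

        byte[] c1 = encrypt(fileOfUserA);
        byte[] c2 = encrypt(fileOfUserB);
        System.out.println("same ciphertext: " + Arrays.equals(c1, c2));

        byte[] recovered = decrypt(deriveKey(fileOfUserA), c1);
        System.out.println("round trip ok: " + Arrays.equals(recovered, fileOfUserA));
    }
}
```

Because the key depends only on the content, two users who upload the same file independently arrive at the same ciphertext without sharing any secret in advance.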

In addition, thanks to the Merkle tree properties, this proposal is shown to
support data deduplication, as it employs a pre-verification of data existence in
cloud servers, which is useful for saving bandwidth. Besides, our solution is
also shown to be resistant to unauthorized access to data and to any data
disclosure during the sharing process, providing two levels of access control
verification. Finally, we believe that cloud data storage security is still full of
challenges and of paramount importance, and many research problems remain to
be identified.
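The pre-verification step can be sketched with a Merkle root used as a file identifier: the client computes the root locally and asks the server whether that root is already known before sending any data. The chunk size, the rule of pairing an odd node with itself, and the in-memory set standing in for the server index are assumptions of this example rather than details taken from the project.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative Merkle root computation and existence check; names are hypothetical.
final class MerkleSketch {

    // SHA-256 over the concatenation of the given parts.
    static byte[] sha256(byte[]... parts) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            for (byte[] p : parts) {
                md.update(p);
            }
            return md.digest();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    // Splits data into fixed-size chunks and returns the Merkle root as hex.
    static String merkleRoot(byte[] data, int chunkSize) {
        List<byte[]> level = new ArrayList<>();
        for (int off = 0; off < data.length; off += chunkSize) {
            int end = Math.min(data.length, off + chunkSize);
            level.add(sha256(Arrays.copyOfRange(data, off, end)));
        }
        if (level.isEmpty()) {
            level.add(sha256(new byte[0]));   // empty file: hash of nothing
        }
        while (level.size() > 1) {
            List<byte[]> next = new ArrayList<>();
            for (int i = 0; i < level.size(); i += 2) {
                byte[] left = level.get(i);
                // An unpaired last node is combined with itself (one common convention).
                byte[] right = (i + 1 < level.size()) ? level.get(i + 1) : left;
                next.add(sha256(left, right));
            }
            level = next;
        }
        return hex(level.get(0));
    }

    // Returns true if the data is new to the index and must be uploaded.
    static boolean needsUpload(Set<String> serverIndex, byte[] data, int chunkSize) {
        return serverIndex.add(merkleRoot(data, chunkSize));
    }

    public static void main(String[] args) {
        Set<String> serverIndex = new HashSet<>();   // stands in for the cloud server
        byte[] file = "0123456789abcdef0123".getBytes(StandardCharsets.UTF_8);
        System.out.println("first upload needed:  " + needsUpload(serverIndex, file, 8));
        System.out.println("second upload needed: " + needsUpload(serverIndex, file, 8));
    }
}
```

Only the short root travels to the server for the check, so a duplicate file costs a few bytes of bandwidth instead of a full upload. A production scheme would normally also require the client to prove that it actually holds the file, which this sketch omits.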

APPENDIX-1
CODING

SAMPLE SOURCE CODE

package com.deduplication;

import java.io.IOException;
import java.io.PrintWriter;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import javax.servlet.RequestDispatcher;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import javax.servlet.http.HttpSession;

public class UserRegistration extends HttpServlet {

    // Writes a simple placeholder page; used for GET requests.
    protected void processRequest(HttpServletRequest request,
            HttpServletResponse response) throws ServletException, IOException {
        response.setContentType("text/html;charset=UTF-8");
        try (PrintWriter out = response.getWriter()) {
            out.println("<!DOCTYPE html>");
            out.println("<html>");
            out.println("<head>");
            out.println("<title>Servlet UserRegistration</title>");
            out.println("</head>");
            out.println("<body>");
            out.println("<h1>Servlet UserRegistration at "
                    + request.getContextPath() + "</h1>");
            out.println("</body>");
            out.println("</html>");
        }
    }

    @Override
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        processRequest(request, response);
    }

    // Handles the registration form: stores the new user and redirects.
    @Override
    protected void doPost(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        HttpSession session1 = request.getSession();
        String reguser = request.getParameter("name");
        String regpws = request.getParameter("password");
        String regemail = request.getParameter("email");
        String regpno = request.getParameter("mobileno");
        // The password is deliberately not written to the log.
        System.out.println("name" + reguser);
        System.out.println("mobileno" + regpno);
        System.out.println("mail" + regemail);

        // Placeholders keep user input out of the SQL text (prevents SQL injection).
        String sql = "INSERT INTO registration(username, password, email, phoneno) "
                + "VALUES (?, ?, ?, ?)";
        try {
            Class.forName("com.mysql.jdbc.Driver");
            // try-with-resources closes the statement and connection automatically.
            try (Connection con = DriverManager.getConnection(
                        "jdbc:mysql://localhost:3306/debuplication", "root", "password");
                    PreparedStatement st = con.prepareStatement(sql)) {
                st.setString(1, reguser);
                st.setString(2, regpws);
                st.setString(3, regemail);
                st.setString(4, regpno);
                int rs = st.executeUpdate();
                if (rs > 0) {
                    response.sendRedirect("index.jsp");
                } else {
                    RequestDispatcher rd =
                            request.getRequestDispatcher("RegistarionFrom.jsp");
                    rd.include(request, response);
                    response.getWriter().print("<br><br><br><h1><center>"
                            + "Sorry UserName or Password Error!</center></h1>");
                }
            }
        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }

    @Override
    public String getServletInfo() {
        return "Short description";
    }
}

package com.deduplication;

import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

class mdhashing {

    public static void main(String args[]) {
        String file = "C:/temp/abc.txt";
        // Note: hecksum() hashes the characters of the string it is given
        // (here, the path text), not the contents of the file at that path.
        System.out.println("MD5 checksum for file using Java : "
                + hecksum(file));
    }

    // Returns the MD5 digest of msg as a 32-character hexadecimal string.
    public static String hecksum(String msg) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] messageDigest = md.digest(msg.getBytes(StandardCharsets.UTF_8));
            System.out.println(Arrays.toString(messageDigest));
            BigInteger number = new BigInteger(1, messageDigest);
            System.out.println(number);
            String hashtext = number.toString(16);
            // BigInteger drops leading zeros, so pad back to the full 32 chars.
            while (hashtext.length() < 32) {
                hashtext = "0" + hashtext;
            }
            return hashtext;
        } catch (NoSuchAlgorithmException e) {
            throw new RuntimeException(e);
        }
    }

    // Returns the raw SHA-1 digest of the contents of the given file.
    public static byte[] createSha1(File file) throws Exception {
        MessageDigest digest = MessageDigest.getInstance("SHA-1");
        try (InputStream fis = new FileInputStream(file)) {
            int n = 0;
            byte[] buffer = new byte[8192];
            while (n != -1) {
                n = fis.read(buffer);
                if (n > 0) {
                    digest.update(buffer, 0, n);
                }
            }
        }
        return digest.digest();
    }

    // Returns the SHA-1 digest of the file at pathaa as a hexadecimal string.
    public static String shaa(String pathaa) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA1");
        byte[] dataBytes = new byte[1024];
        try (FileInputStream fis = new FileInputStream(pathaa)) {
            int nread = 0;
            while ((nread = fis.read(dataBytes)) != -1) {
                md.update(dataBytes, 0, nread);
            }
        }
        byte[] mdbytes = md.digest();
        // Convert each byte to two hexadecimal digits.
        StringBuffer sb = new StringBuffer("");
        for (int i = 0; i < mdbytes.length; i++) {
            sb.append(Integer.toString((mdbytes[i] & 0xff) + 0x100,
                    16).substring(1));
        }
        System.out.println("Digest(in hex format):: " + sb.toString());
        return sb.toString();
    }
}
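A digest such as the one computed above is typically used as a lookup key for duplicate detection. The sketch below shows one way this could work, assuming an in-memory map in place of the project's database; the class and method names are hypothetical, and SHA-256 is swapped in for the MD5 and SHA-1 digests used in the listing above.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Illustrative digest-keyed index; not the project's storage layer.
final class DedupIndexSketch {

    // Maps a content digest to the name under which it was first stored.
    private final Map<String, String> digestToName = new HashMap<>();

    // Returns the SHA-256 digest of the content as lowercase hex.
    static String sha256Hex(byte[] content) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(content);
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Registers the content. Returns null if it is new, or the name of the
    // earlier file with identical content if it is a duplicate.
    String register(String name, byte[] content) {
        return digestToName.putIfAbsent(sha256Hex(content), name);
    }

    public static void main(String[] args) {
        DedupIndexSketch index = new DedupIndexSketch();
        byte[] hello = "hello".getBytes(StandardCharsets.UTF_8);
        byte[] world = "world".getBytes(StandardCharsets.UTF_8);

        System.out.println(index.register("a.txt", hello)); // new content
        System.out.println(index.register("b.txt", hello)); // duplicate of a.txt
        System.out.println(index.register("c.txt", world)); // new content
    }
}
```

When register returns a non-null name, the caller can skip the upload and record a reference to the existing copy instead.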

APPENDIX-2
SCREENSHOTS
