Uploaded by

cpmarqui

Search Pubmed with R Part1 and Part2

R Project

R is a free software environment for statistical computing, data manipulation, calculation and graphical display (1,2). The associated Bioconductor project provides many additional R packages for statistical data analysis in different life science areas, such as tools for microarray, next-generation sequencing and genome analysis. The R software is free and runs on all common operating systems (2-4). It facilitates the inclusion of biological metadata from literature sources such as PubMed, and it provides access to powerful statistical and graphical methods.
References:
1- The R Project for Statistical Computing: http://www.r-project.org/
2- W. N. Venables, D. M. Smith and the R Development Core Team. An Introduction to R. Notes on R: A Programming Environment for Data Analysis and Graphics. Version 2.14.2 (2012-02-29).
3- R & Bioconductor Manual. Author: Thomas Girke, UC Riverside. http://manuals.bioinformatics.ucr.edu/home/R_BioCondManual#TOC-R-Basics
4- Bioconductor: http://www.bioconductor.org/

Install R
1- Install the latest release of R according to the instructions provided by The R Project for Statistical Computing: http://www.r-project.org/
2- Once installed, open the R command window (R console).
3- In the R Console, the > prompt (shown in red) is where you type the commands.
4- Any text following the hash symbol # on a line is treated as a comment and is ignored by R.

References
1- The R Project for Statistical Computing: http://www.r-project.org/
2- Bioconductor: http://www.bioconductor.org/
3- R Tutorials. W.B. King. 2010. http://ww2.coastal.edu/kingw/statistics/R-tutorials/preliminaries.html

Install packages in R
1- In the R Console, type the following to connect to Bioconductor:
source("http://bioconductor.org/biocLite.R")
2- To request installation of the packages, type:
biocLite()
3- Install the packages "RISmed" and "tm" by typing:
biocLite(c("RISmed", "tm"))
4- Install the package "ggplot2" by typing:
biocLite("ggplot2")
Note: biocLite has since been retired by Bioconductor in favour of the BiocManager package. The three packages used here are distributed through CRAN, so in current R versions install.packages(c("RISmed", "tm", "ggplot2")) is an alternative.
Package RISmed downloads content from NCBI databases.
Package tm provides text mining functionality.
Package ggplot2 is for data visualization.
References
1- Bioconductor: http://www.bioconductor.org/
RISmed package: Stephanie Kovalchik (2013). RISmed: Download content from NCBI databases. R package version 2.1.0. http://CRAN.R-project.org/package=RISmed
tm package: Ingo Feinerer and Kurt Hornik (2013). tm: Text Mining Package. R package version 0.5-8.3. http://CRAN.R-project.org/package=tm
ggplot2 package: H. Wickham. ggplot2: elegant graphics for data analysis. Springer New York, 2009. http://had.co.nz/ggplot2/book also http://cran.r-project.org/web/packages/ggplot2/index.html
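Before installing, it can help to check which of the three packages are already present. The following is a minimal sketch using base R only; the helper name missing_packages is hypothetical and is not part of the slides.

```r
# Return the subset of 'pkgs' that cannot be loaded from the current library.
# (missing_packages is a made-up helper name used for illustration.)
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

needed <- c("RISmed", "tm", "ggplot2")
to_install <- missing_packages(needed)

if (length(to_install) > 0) {
  message("Not installed yet: ", paste(to_install, collapse = ", "))
  # install.packages(to_install)   # uncomment to install from CRAN
}
```

Only the packages reported as missing then need to be installed.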

The R Console

Query pubmed titles for oncolytic virus using RISmed

Type the following in the R console:
library(RISmed)
onc <- EUtilsSummary("oncolytic virus[Majr]")
onc
# [1] "\"oncolytic viruses\"[MeSH Major Topic]"
fetch.onc <- EUtilsGet(onc)
fetch.onc
# PubMed query: "oncolytic viruses"[MeSH Major Topic] Records: 713
onc.tit <- ArticleTitle(fetch.onc)
onc.tit <- unlist(onc.tit)
# export title results as text file
write(onc.tit, file="title_oncolytic_virus.txt")
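RISmed talks to NCBI's E-utilities web service. As a rough illustration of the kind of search request involved, the sketch below builds (but does not send) an ESearch URL for the same query. This is not RISmed's internal code; the helper name build_esearch_url and the retmax value are assumptions made for the example.

```r
# Build (but do not send) an NCBI ESearch request URL for a PubMed query.
# build_esearch_url is a hypothetical helper, not a RISmed function.
build_esearch_url <- function(term, db = "pubmed", retmax = 20) {
  base <- "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
  paste0(base,
         "?db=", db,
         "&retmax=", retmax,
         "&term=", URLencode(term, reserved = TRUE))  # escape spaces and brackets
}

url <- build_esearch_url("oncolytic virus[Majr]")
print(url)
```

The [Majr] tag restricts the search to records where the term is a MeSH major topic, which matches the translated query shown in the output on the slide.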

Query pubmed MESH topic for oncolytic virus using RISmed

# Continue to type the following in the R console:
mh <- Mesh(fetch.onc) # list with one element per article
mh.per.row <- lapply(seq_along(mh), function(i) {
mh.df.rbind <- as.data.frame(do.call(rbind, mh[i])) # reuse mh instead of calling Mesh() again for every article
paste(mh.df.rbind$Heading, collapse = ";") # join the headings of one article with ";"
})
mh.list <- unlist(mh.per.row)
# The following exports the mesh results as a text file
write(mh.list, file="mesh_oncolytic_virus.txt")
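The collapsing step can be tried without a PubMed connection. The sketch below uses a made-up stand-in for the list returned by Mesh(); the assumed structure (one data frame with a Heading column per article, NA when an article has no MeSH terms) and the helper name collapse_headings are illustrative assumptions.

```r
# Toy stand-in for the list returned by Mesh(): one data frame per article,
# or NA when an article has no MeSH annotation (assumed structure, made-up data).
mh_toy <- list(
  data.frame(Heading = c("Oncolytic Viruses", "Neoplasms"),
             Type = c("Descriptor", "Descriptor")),
  NA,
  data.frame(Heading = "Oncolytic Virotherapy", Type = "Descriptor")
)

# Join all headings of one article into a single ";"-separated string.
collapse_headings <- function(x) {
  if (!is.data.frame(x)) return("")   # no MeSH terms for this article
  paste(x$Heading, collapse = ";")
}

mh_strings <- vapply(mh_toy, collapse_headings, character(1))
print(mh_strings)
```

Each article ends up as exactly one string, so the result lines up row by row with the vector of titles.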

View results in excel

# export both title and mesh results as a text file to view as a table in Excel
tit.mh <- cbind(onc.tit, mh.list)
tit.mh[1:10,] # view first 10 results
write.table(tit.mh, file="tit_mesh_oncolytic_virus.txt", row.names=F, sep="\t") # file name without the trailing space
# open the file in Excel
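To check that the exported table is really tab-separated, it can be read back into R. A minimal sketch with made-up titles and terms, written to a temporary file so that nothing in the working directory is touched:

```r
# Made-up example data standing in for onc.tit and mh.list.
titles <- c("Title A", "Title B")
mesh   <- c("Term1;Term2", "Term3")
tab <- cbind(title = titles, mesh = mesh)

# Write a tab-separated file and read it back.
out_file <- file.path(tempdir(), "toy_title_mesh.txt")
write.table(tab, file = out_file, row.names = FALSE, sep = "\t")
back <- read.delim(out_file, stringsAsFactors = FALSE)
print(back)
```

Because the MeSH terms are joined with ";" and the columns are separated by tabs, each article stays on one row with two columns.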

[Screenshot: the exported file opened in Excel, with one column containing the titles and one column containing the corresponding MeSH terms]

Preparing for Text Mining Analysis

Type getwd() in the R console to display the R working directory. In my case:
[1] "C:/Documents and Settings/PMarqui/My Documents"
Now create a new folder in the R working directory and give it a name (for example, OncolyticVirus).
Place the two recently created text files in the new folder: title_oncolytic_virus.txt and mesh_oncolytic_virus.txt.
Then start the text mining analysis.
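The folder can also be prepared from within R. A minimal sketch, assuming a temporary directory in place of the real working directory and two placeholder files in place of the real exports:

```r
# tempdir() stands in for the directory reported by getwd().
work_dir <- tempdir()

# Placeholder files standing in for the two exported text files.
writeLines("toy title", file.path(work_dir, "title_oncolytic_virus.txt"))
writeLines("toy mesh",  file.path(work_dir, "mesh_oncolytic_virus.txt"))

# Create the corpus folder and copy both files into it.
corpus_dir <- file.path(work_dir, "OncolyticVirus")
dir.create(corpus_dir, showWarnings = FALSE)
files <- c("title_oncolytic_virus.txt", "mesh_oncolytic_virus.txt")
copied <- file.copy(file.path(work_dir, files), corpus_dir, overwrite = TRUE)

print(list.files(corpus_dir))
```

list.files() returns the names in alphabetical order, so the mesh file is listed before the title file.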

Text Mining Analysis

# Type the following in the R Console
library(tm) # loads the text mining package
my.corpus <- Corpus(DirSource("OncolyticVirus"), readerControl=list(reader=readPlain))
# Note that "OncolyticVirus" is the name of the newly created folder. In DirSource() you must use the name given to the folder containing the 2 text files.
my.corpus <- tm_map(my.corpus, stripWhitespace) # removes extra whitespace
my.corpus <- tm_map(my.corpus, gsub, pattern="[^[:alnum:][:space:]]", replacement=" ") # replaces every character that is not a letter, digit or whitespace with a space; as written this also removes the dash "-" (to keep dashes, use pattern="[^[:alnum:][:space:]-]")
# my.corpus <- tm_map(my.corpus, removeNumbers) # removes numbers (optional)
# Note: tm version 0.6 and later expects base functions such as gsub and tolower to be wrapped in content_transformer(), for example tm_map(my.corpus, content_transformer(tolower))
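The effect of the punctuation pattern is easiest to see on a single string. The sketch below applies the pattern from the slide and a variant that keeps the dash; the example title is made up, and the second pattern is an assumption about what "except dash" was meant to achieve.

```r
# Pattern from the slide: anything that is not a letter, digit or whitespace.
strip_all <- function(x) gsub("[^[:alnum:][:space:]]", " ", x)

# Variant that additionally keeps the dash (a trailing "-" inside the
# brackets is treated as a literal character).
strip_keep_dash <- function(x) gsub("[^[:alnum:][:space:]-]", " ", x)

example_title <- "Oncolytic HSV-1 (T-VEC): a phase-II trial."   # made-up title
cleaned_all  <- strip_all(example_title)
cleaned_dash <- strip_keep_dash(example_title)

print(cleaned_all)
print(cleaned_dash)
```

With the first pattern "HSV-1" is split into two tokens, while the second keeps it as one term.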

Text Mining Analysis

# Continue and type the following code in the R Console:
my.corpus <- tm_map(my.corpus, tolower) # conversion to lower case letters
my.corpus <- tm_map(my.corpus, removeWords, stopwords("english")) # removes stopwords
my.corpus <- tm_map(my.corpus, stemDocument) # removes suffixes from words to get their common origin
my.corpus.matrix <- TermDocumentMatrix(my.corpus) # creates a term-document matrix
mat.my.corpus <- as.matrix(my.corpus.matrix) # creates a matrix
my.corpus.df <- as.data.frame(mat.my.corpus) # creates a data frame from the matrix, displaying all the terms found in either of the 2 documents
my.corpus.df[200:250,1:2] # view some of the terms
copy.my.corpus.df <- my.corpus.df # make a copy of the my.corpus.df data frame to keep the original for later
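A term-document matrix has one row per term and one column per document, and each cell holds the number of times the term occurs in that document. The sketch below builds such a table with base R only, from two made-up one-line documents; it illustrates the data structure and is not how tm computes it.

```r
# Two made-up documents standing in for the mesh and title files.
docs <- c(mesh  = "oncolytic viruses neoplasms oncolytic virotherapy",
          title = "oncolytic virus therapy for neoplasms")

# Split each document into lower-case tokens.
tokens <- strsplit(tolower(docs), "[[:space:]]+")
terms  <- sort(unique(unlist(tokens)))

# Count every term in every document: rows are terms, columns are documents.
tdm <- vapply(tokens,
              function(tok) as.integer(table(factor(tok, levels = terms))),
              integer(length(terms)))
rownames(tdm) <- terms

tdm_df <- as.data.frame(tdm)
print(tdm_df)
```

Terms that occur in only one of the two documents get a count of 0 in the other column, which is what the later subsetting steps rely on.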

Text Mining Analysis

# Continue and type the following code in the R Console:
# sort the data frame by the most frequent mesh terms
my.corpus.df <- my.corpus.df[order(my.corpus.df$mesh_oncolytic_virus.txt, decreasing = T),]
# assign the 50 most frequent mesh terms to xx
xx <- my.corpus.df[1:50,]
# view the top 5 most frequent mesh terms; "head(xx, 5)" is equivalent
xx[1:5,]
# sort the 50 most frequent mesh terms in increasing order (for plot visualization)
xx <- xx[order(xx$mesh_oncolytic_virus.txt, decreasing = FALSE),]
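The sort, select and re-sort sequence can be followed on a small table. A minimal sketch with made-up counts and the top 2 terms instead of the top 50:

```r
# Made-up term counts; row names are the (stemmed) terms.
counts <- data.frame(mesh  = c(5, 0, 12, 3),
                     title = c(2, 7, 9, 0),
                     row.names = c("virus", "therapi", "oncolyt", "neoplasm"))

# 1. Sort by mesh count, largest first.
sorted <- counts[order(counts$mesh, decreasing = TRUE), ]

# 2. Keep the top 2 terms (the slides keep the top 50).
top2 <- head(sorted, 2)

# 3. Re-sort in increasing order, so the largest count is drawn last on the plot axis.
top2_plot <- top2[order(top2$mesh), ]
print(top2_plot)
```

The row names travel with the rows through every step, which is why rownames() can later be used for the plot labels.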

Text Mining Analysis

# Continue and type the following code in the R Console:
# Plot the 50 most frequent mesh terms using the ggplot2 library
library(ggplot2)
Terms <- rownames(xx)
Mesh.count <- xx$mesh_oncolytic_virus.txt
ggplot(xx) + geom_point(aes(Terms, Mesh.count), colour = "darkblue") + coord_flip() + theme_bw() # colour (not fill) sets the colour of the default point shape
p1 <- last_plot() + scale_x_discrete(limits=(Terms)) # keep the terms in sorted order on the axis
p1
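The limits argument is needed because a discrete axis is otherwise ordered alphabetically, in the same way that factor() orders its levels. A minimal base R sketch of the difference, with made-up terms:

```r
# Terms already sorted by count (made-up example).
Terms <- c("virus", "neoplasm", "oncolyt")

# Default factor levels are alphabetical, so the count order is lost.
default_levels <- levels(factor(Terms))

# Passing the terms as explicit levels preserves the count order.
kept_levels <- levels(factor(Terms, levels = Terms))

print(default_levels)
print(kept_levels)
```

Passing the sorted vector as limits plays the same role for the plot axis as the explicit levels do here.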

Text Mining Analysis

View the 50 most frequent mesh terms

Part 2

Text Mining Analysis

# Continue and type the following code in the R Console:
# Now select the most frequent title terms. Therefore sort the data frame by title count in decreasing order
my.corpus.df <- my.corpus.df[order(my.corpus.df$title_oncolytic_virus.txt, decreasing = T),]
xy <- my.corpus.df[1:50,] # assign the 50 most frequent title terms to xy
xy[1:5,] # view the top 5 most frequent title terms
# sort the 50 most frequent title terms in increasing order (for plot visualization)
xy <- xy[order(xy$title_oncolytic_virus.txt, decreasing = FALSE),]
# Plot the 50 most frequent title terms
require(ggplot2)
Terms <- rownames(xy) # take the labels from xy (the title terms), not from xx
Title.count <- xy$title_oncolytic_virus.txt
ggplot(xy) + geom_point(aes(Terms, Title.count), colour = "darkblue") + coord_flip() + theme_bw()
p2 <- last_plot() + scale_x_discrete(limits=(Terms))
p2

Text Mining Analysis

View the 50 most frequent title terms

Text Mining Analysis

# Continue and type the following code in the R Console: Create separate data frames for each frequency type

my.corpus.sub1.df <- subset(my.corpus.df, mesh_oncolytic_virus.txt>0 & title_oncolytic_virus.txt>0) # subset of terms common to the 2 documents
my.corpus.sub1.df[200:300,1:2] # view some of the subset terms
my.corpus.sub2.df <- subset(my.corpus.df, mesh_oncolytic_virus.txt==0 & title_oncolytic_virus.txt>0) # terms present in title but not in mesh
my.corpus.sub2.df[200:300,1:2] # view some of the terms (200-300)
my.corpus.sub3.df <- subset(my.corpus.df, mesh_oncolytic_virus.txt>0 & title_oncolytic_virus.txt==0) # terms present in mesh but not in title
my.corpus.sub3.df[200:300,1:2] # view some of the terms

# Correlate the term counts in title and mesh
cor(my.corpus.df$title_oncolytic_virus.txt, my.corpus.df$mesh_oncolytic_virus.txt)
# correlation coefficient is [1] 0.4442518
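The three subsets and the correlation can be reproduced on a small made-up table, which makes it easier to see which terms end up where. The counts below are invented for illustration.

```r
# Made-up term counts for the two documents.
toy <- data.frame(mesh  = c(5, 0, 12, 3, 0),
                  title = c(2, 7, 9, 0, 1),
                  row.names = c("virus", "therapi", "oncolyt", "neoplasm", "phase"))

both       <- subset(toy, mesh > 0 & title > 0)    # terms in both documents
title_only <- subset(toy, mesh == 0 & title > 0)   # terms only in titles
mesh_only  <- subset(toy, mesh > 0 & title == 0)   # terms only in mesh

# cor() computes the Pearson correlation by default.
r <- cor(toy$title, toy$mesh)

print(rownames(both))
print(rownames(title_only))
print(rownames(mesh_only))
print(r)
```

Every term falls into exactly one of the three subsets, since a term with a count of 0 in both columns cannot occur in the matrix.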

Text Mining Analysis

# below generates a term frequency vector from the term-document matrix
termFrequency <- rowSums(as.matrix(my.corpus.matrix))
my.tdm <- TermDocumentMatrix(my.corpus, control = list(minWordLength = 1)) # recent tm versions name this option wordLengths, for example control = list(wordLengths = c(1, Inf))
my.tdm
# A term-document matrix (2632 terms, 2 documents)
# below selects those terms from the term-document matrix which occur at least 100 times
findFreqTerms(my.tdm[,1], lowfreq=100)
findFreqTerms(my.tdm[,2], lowfreq=100)
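The two operations on this slide, summing counts per term and selecting terms above a threshold, can be mimicked with base R on a small made-up matrix. This is an analogue for illustration and not tm's own implementation of findFreqTerms.

```r
# Made-up term-document matrix: rows are terms, columns are documents.
toy_tdm <- matrix(c(150, 20,
                    320, 90,
                     12,  3,
                    100, 40),
                  ncol = 2, byrow = TRUE,
                  dimnames = list(c("virus", "oncolyt", "phase", "therapi"),
                                  c("mesh", "title")))

# Total count per term over both documents.
term_frequency <- rowSums(toy_tdm)

# Terms that occur at least 100 times in the mesh column.
frequent_mesh <- rownames(toy_tdm)[toy_tdm[, "mesh"] >= 100]

print(term_frequency)
print(frequent_mesh)
```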

For part 2

Text Mining Analysis

# Code for plot 3: most frequent title terms with the corresponding mesh terms
my.corpus.df <- my.corpus.df[order(my.corpus.df$title_oncolytic_virus.txt, decreasing = T),]
xy <- my.corpus.df[1:50,] # assign the 50 most frequent title terms to xy
# sort the 50 most frequent title terms in increasing order (for plot visualization)
xy <- xy[order(xy$title_oncolytic_virus.txt, decreasing = FALSE),]
# Plot the 50 most frequent title terms and the mesh counts of those same terms
Terms <- rownames(xy) # labels must come from xy, the data frame being plotted
Title.count <- xy$title_oncolytic_virus.txt
Mesh.count <- xy$mesh_oncolytic_virus.txt
ggplot(xy, aes(Terms)) + geom_point(aes(y = Mesh.count, colour = "Mesh.count")) + geom_point(aes(y = Title.count, colour = "Title.count"))
p3 <- last_plot() + coord_flip()
p3 <- last_plot() + scale_x_discrete(limits=(Terms))
p3

plot 3: most frequent title terms with the corresponding mesh terms

Text Mining Analysis

# Code for plot 4: most frequent title terms and most frequent mesh terms
top50.mh.ti <- rbind(xx, xy[!rownames(xy) %in% rownames(xx), ]) # combine top 50 mesh and title terms; a term present in both lists is kept once, otherwise rbind() would rename the second copy (for example "virus1")
Terms <- rownames(top50.mh.ti) # assign rownames to Terms
msh <- top50.mh.ti$mesh_oncolytic_virus.txt
titl <- top50.mh.ti$title_oncolytic_virus.txt
p4 <- ggplot(top50.mh.ti)
p4 <- p4 + geom_text(aes(x = msh, y = titl, label = Terms))
p4

Text Mining Analysis

plot 4: most frequent title terms and most frequent mesh terms

Text Mining Analysis

my.corpus.df <- my.corpus.df[order(my.corpus.df$title_oncolytic_virus.txt, decreasing = T),]
xy <- my.corpus.df[1:50,] # assign the 50 most frequent title terms to xy
xy[1:5,] # view the top 5 most frequent title terms
# sort the 50 most frequent title terms in increasing order (for plot visualization)
xy <- xy[order(xy$title_oncolytic_virus.txt, decreasing = FALSE),]
top50.mh.ti <- rbind(xx, xy[!rownames(xy) %in% rownames(xx), ]) # combine top 50 mesh and title terms; a term present in both lists is kept once
top50.mh.ti$Term <- rownames(top50.mh.ti) # store the terms in their own column
rownames(top50.mh.ti) <- NULL # drop the row names of the data frame (the original line targeted the Term column, which has no row names)
colnames(top50.mh.ti)[1] <- "msh" # change col name
colnames(top50.mh.ti)[2] <- "title" # change col name

Text Mining Analysis

# plot 5: most frequent title terms and most frequent mesh terms
library("reshape2")
# reshape2 is used to transform wide-format data by means of the melt function. The melt function takes data in wide format and stacks a set of columns into a single column of data.
top50.melt <- melt(top50.mh.ti, measure.vars = c("title", "msh"))
top50.melt
p <- ggplot(top50.melt, aes(Term, value, colour = variable)) + geom_point() + coord_flip() # refer to columns by name inside aes() rather than with top50.melt$
p5 <- last_plot() + scale_x_discrete(limits = unique(top50.melt$Term)) # each term occurs twice after melt(), so the limits are made unique
p5

Reference for reshape package: Hadley Wickham (2007). Reshaping Data with the reshape Package. Journal of Statistical Software, 21(12), 1-20. URL http://www.jstatsoft.org/v21/i12/.
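What melt does to the table can be shown with base R on a two-row example. The sketch below stacks the title and msh columns by hand; it mirrors the layout of the melted result (Term, variable, value) with made-up counts and is not a replacement for reshape2.

```r
# Wide format: one row per term, one column per count (made-up values).
wide <- data.frame(Term  = c("virus", "oncolyt"),
                   msh   = c(5, 12),
                   title = c(2, 9),
                   stringsAsFactors = FALSE)

# Long format: one row per term and count type.
long <- data.frame(Term     = rep(wide$Term, times = 2),
                   variable = rep(c("title", "msh"), each = nrow(wide)),
                   value    = c(wide$title, wide$msh))
print(long)
```

In the long format a single value column can be mapped to the plot axis and the variable column to the colour, which is what plot 5 uses.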

Text Mining Analysis

# plot 5: most frequent title terms and most frequent mesh terms

p5 <- ggplot(top50.melt, aes(Term, value, colour = variable)) + geom_point() + coord_flip() # columns referenced by name inside aes()
p5

plot 5: most frequent title terms and most frequent mesh terms
