franfranz/Word_Frequency_Toytools

Word Frequency Toytools

Code to generate/annotate/handle lists of frequency from corpora.

Normalize Word Frequency v0.1.5

R code to normalize raw frequency counts into fpmw (frequency per million words), fpbw (frequency per billion words), Zipf, Zipf per billion, and other common word frequency measures. To use Normalize Word Frequency:

Prepare your input file:

  • make sure your txt or csv input file has a header row: the column holding the raw frequency counts you want to normalize must be named "Frequency"
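
As a rough illustration of the expected input, the sketch below reads a tiny comma-separated list and checks that the required header is present. Only the "Frequency" column name comes from this README; the "Word" column, the toy counts, and the variable names are assumptions made for the example.

```r
# Toy input, inlined as text so the example is self-contained.
# Hypothetical data: only the "Frequency" header is required by the script.
input_text <- "Word,Frequency\ncasa,1200\ngatto,35\n"

# Read the list with a header row and a comma separator.
freq_list <- read.csv(text = input_text, header = TRUE, sep = ",")

# Stop early if the column the script relies on is missing.
stopifnot("Frequency" %in% names(freq_list))
```

For a tab-separated txt file, the same check should work with `read.delim()` or `read.table(..., header = TRUE, sep = "\t")`.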

Set input specifications in the code:

  • set the paths for the input and output files (lines 28-30)
  • set the file extension (line 36)
  • set the file separator (line 48)
  • set the size of the corpus (line 70); this version uses the size of the itWaC corpus

Set output specifications in the code:

  • choose which transformations to apply by commenting or uncommenting the corresponding lines (lines 56-65)
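
The transformations themselves can be sketched as follows. This is a minimal illustration, not the script's actual code: the function name `normalize_frequency`, the toy counts, and the placeholder corpus size are assumptions, and the Zipf value follows the common definition log10(fpmw) + 3, which may differ from the script if it applies smoothing. Individual measures can be switched off by commenting out their lines, in the spirit of the step above.

```r
# Hypothetical helper: converts raw counts into common frequency measures.
normalize_frequency <- function(raw_freq, corpus_size) {
  fpmw <- raw_freq / corpus_size * 1e6   # frequency per million words
  fpbw <- raw_freq / corpus_size * 1e9   # frequency per billion words
  zipf <- log10(fpmw) + 3                # Zipf scale, equal to log10(fpbw)
  data.frame(Frequency = raw_freq, fpmw = fpmw, fpbw = fpbw, zipf = zipf)
}

# Placeholder corpus size in tokens; replace with the size of your corpus.
corpus_size <- 2e6

# Toy raw counts, chosen so the results are round numbers.
result <- normalize_frequency(c(2, 200, 20000), corpus_size)
print(result)
```

With these toy values, a word that occurs 2 times in a 2-million-token corpus has an fpmw of 1 and a Zipf value of 3.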

Normalize Word Frequency v0.1.4

This version has been deprecated.

About

Handy frequency lists from corpora and a few related utilities.
