R - tm corpus exporting some structural function words


Using the tm library, I have a corpus built from a vector source of words, with this structure:

    library(tm)

    text <- readLines("some.txt")
    finalcorpus <- Corpus(VectorSource(newcorpus))
    finalcorpus <- tm_map(finalcorpus, stripWhitespace)
    save(finalcorpus, file = "data/debug.rda")  # debug
    df <- data.frame(lapply(finalcorpus, as.character), stringsAsFactors = FALSE)
    df

    >protracted periods meditation fasting prayer ennui fever energy vigor
    >married joseph lee dollars million canadian dollars gbp pastored african
    >american church snow hill jersey children died infancy **meta list author
    >character datetimestamp list sec min hour mday mon year wday yday isdst
    >description character heading character id language en origin character
    >x2   x3
    >1 list list**

The words between ** come from the corpus object, not from the imported text. Why are they there, and how can I remove them (without using tm's removeWords function)?

