javascript - How to scrape a live JavaScript webpage in R?
I am trying to scrape the play-by-play from http://stats.statbroadcast.com/statmonitr/?id=107165. The link brings you to the "Split Box" tab. I am interested in scraping the Play by Play tab as well as the Home Stats and Visitors Stats tabs. One of the problems is that no matter which tab you switch to, the URL never changes. If you use SelectorGadget, the CSS selector for the main contents of the tabs is the same as well: "#stats". I am a novice at web scraping, and most of the time I can scrape an HTML page with the package rvest, but unfortunately I am lost as to how I should proceed with JavaScript. I have heard of JSON, but I am not sure how to combat the issue of all the tabs having the same URL.
My main goal is to be able to scrape the Play by Play, Home Stats, and Visitors Stats tabs while the game is live.
Any help is appreciated. Please let me know if I should provide more info.
You can use RSelenium as follows:
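    library(RSelenium)
    # Start a local Selenium server. Note: startServer() is defunct in newer
    # RSelenium releases, where rsDriver() is used instead.
    RSelenium::startServer()
    # Connect to the server and open a browser session
    remDr <- remoteDriver()
    remDr$open()
    remDr$navigate("http://stats.statbroadcast.com/statmonitr/?id=107165")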
Now a Firefox window should open, in which you can browse as normal. Calling doc <- remDr$getPageSource() gives you the source code of the current webpage. You can use rvest to scrape that code as follows:
    # getPageSource() returns a list; its first element is the HTML string
    doc <- remDr$getPageSource()[[1]]
    library(rvest)
    current_doc <- read_html(doc)
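As a minimal sketch of the rvest step, the example below parses an HTML string and pulls a table out of the "#stats" container mentioned in the question. The markup, player names, and numbers are invented stand-ins for what remDr$getPageSource()[[1]] would return; the real page's structure may differ. It assumes rvest 1.0.0 or later for html_element().

```r
library(rvest)

# Hypothetical stand-in for remDr$getPageSource()[[1]];
# the real page's markup is an assumption here.
doc <- '<html><body><div id="stats"><table>
<tr><th>Player</th><th>PTS</th></tr>
<tr><td>Smith</td><td>12</td></tr>
<tr><td>Jones</td><td>8</td></tr>
</table></div></body></html>'

current_doc <- read_html(doc)

# Select the container that holds the content of the active tab
stats_node <- html_element(current_doc, "#stats")

# Convert the first table inside the container to a data frame
stats_table <- html_table(html_element(stats_node, "table"))
print(stats_table)
```

Because every tab renders into the same "#stats" container, the same extraction code can be reused after each tab switch; only the page source passed to read_html() changes.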
If you want to automate the "browsing", you can, for example, navigate to the "Play by Play" page as follows:
    # Locate the tab by its CSS selector, move the mouse to it, and click
    webElem <- remDr$findElement(using = "css selector", '#bb_b6')
    remDr$mouseMoveToLocation(webElement = webElem)
    remDr$click(1)
At the end, close the remote driver and shut down the Selenium server:
    # Shutdown
    remDr$close()
    browseURL("http://localhost:4444/selenium-server/driver/?cmd=shutDownSeleniumServer")
For more details, see: https://cran.r-project.org/web/packages/RSelenium/vignettes/RSelenium-basics.html
Edit: current_doc captures the website as it is at the moment you execute doc <- remDr$getPageSource()[[1]]. It is not real-time, as you might like; it is a one-time snapshot.
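Since each call yields only a snapshot, one way to follow a live game is to request the page source repeatedly at a fixed interval. The sketch below is one possible approach, not part of the original answer: poll_snapshots() and its parameters are hypothetical names, and the page-source getter is passed in as a function so the helper can be tried without a running browser.

```r
library(rvest)

# Hypothetical helper: call get_source() n times, waiting `interval`
# seconds between calls, and return the parsed snapshots as a list.
poll_snapshots <- function(get_source, n = 3, interval = 10) {
  snapshots <- vector("list", n)
  for (i in seq_len(n)) {
    snapshots[[i]] <- read_html(get_source())
    # Do not wait after the final snapshot
    if (i < n) Sys.sleep(interval)
  }
  snapshots
}

# With a live session, one would pass the RSelenium call, e.g.:
# snaps <- poll_snapshots(function() remDr$getPageSource()[[1]],
#                         n = 5, interval = 30)

# Offline demonstration with a fake source that changes on every call
counter <- 0L
fake_source <- function() {
  counter <<- counter + 1L
  sprintf("<html><body><p id='n'>%d</p></body></html>", counter)
}
snaps <- poll_snapshots(fake_source, n = 3, interval = 0)
```

A suitable interval depends on how often the site refreshes its data, which is not stated here.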
If you want to scrape "Period I", proceed as follows: first navigate to "Play by Play" (as shown above), wait with Sys.sleep(3) until the website has loaded, then navigate to "Period I" the same way you navigated to "Play by Play", using its CSS selector.
Have a look at the remote driver (that is, the browser window you control) to check whether you have arrived at the "Period I" webpage.
Once you have arrived, execute doc <- remDr$getPageSource()[[1]] and analyse the content.
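The click, wait, and capture sequence described above can be wrapped in one function. This is a sketch under assumptions: scrape_tab() is a hypothetical name, and it uses the element's clickElement() method rather than the mouseMoveToLocation() and click() pair from the answer. The fake driver at the end is only there so the function can be exercised without a browser.

```r
library(rvest)

# Hypothetical helper: click the element matching `css`, wait `wait`
# seconds for the page to update, then return the parsed page source.
scrape_tab <- function(remDr, css, wait = 3) {
  tab <- remDr$findElement(using = "css selector", value = css)
  tab$clickElement()
  Sys.sleep(wait)
  read_html(remDr$getPageSource()[[1]])
}

# Offline stand-in that mimics the three driver calls used above
fake_elem <- list(clickElement = function() invisible(NULL))
fake_driver <- list(
  findElement = function(using, value) fake_elem,
  getPageSource = function() {
    list("<html><body><div id='stats'>ok</div></body></html>")
  }
)
page <- scrape_tab(fake_driver, "#bb_b6", wait = 0)
stats_text <- html_text(html_element(page, "#stats"))
```

With a real session, scrape_tab(remDr, "#bb_b6") would correspond to the "Play by Play" step; the selector for "Period I" is not given here and has to be looked up in the page before the function can be called a second time.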