Parsing JavaScript simply


Graham Cox
 

Hi all,

I made an app that scrapes web pages looking for a specific tag, namely the <video> tag, to get the address of a video stream. If you display the page in a browser such as Safari, you can right-click the video and choose “Copy Video Address” from the context menu to extract the URL. My app is intended to obtain that same URL.
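
To illustrate the simple case, here is a minimal sketch in Swift of the kind of scan I mean. It is not my actual code: the function names firstMatch and firstVideoAddress and the example.com URL are made up for illustration, and a regular expression is only a rough stand-in for a proper HTML parser. It assumes the address appears literally in the page source, either on the <video> tag or on a nested <source> tag.

```swift
import Foundation

/// Returns the requested capture group of the first match of `pattern` in `text`,
/// or nil if the pattern is invalid or does not match.
func firstMatch(of pattern: String, in text: String, group: Int) -> String? {
    guard let regex = try? NSRegularExpression(pattern: pattern,
                                               options: [.caseInsensitive]) else {
        return nil
    }
    let whole = NSRange(text.startIndex..<text.endIndex, in: text)
    guard let match = regex.firstMatch(in: text, options: [], range: whole),
          let range = Range(match.range(at: group), in: text) else {
        return nil
    }
    return String(text[range])
}

/// Hypothetical helper: finds the first <video>...</video> element in the HTML
/// and returns the first quoted `src` attribute value inside it, if any.
func firstVideoAddress(inHTML html: String) -> String? {
    // Step 1: isolate the first <video> element so that a `src` belonging
    // to an unrelated tag elsewhere on the page is not picked up.
    guard let element = firstMatch(of: "<video\\b[\\s\\S]*?</video>",
                                   in: html, group: 0) else {
        return nil
    }
    // Step 2: take the first src="..." or src='...' within that element.
    // Requiring whitespace before `src` skips attributes such as `data-src`.
    return firstMatch(of: "\\ssrc\\s*=\\s*[\"']([^\"']+)[\"']",
                      in: element, group: 1)
}

// Example with a made-up page fragment.
let sample = "<html><body><video controls><source src=\"https://example.com/stream.m3u8\"></video></body></html>"
print(firstVideoAddress(inHTML: sample) ?? "no video address found")
```

A scan like this returns nil whenever the address is filled in by a script after the page loads.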

On some sites, the video address is generated by obfuscating JavaScript rather than simply embedded in the HTML of the page. Nevertheless, once the page is displayed, Safari still lets you manually copy the video stream’s URL, so the obfuscation doesn’t provide any real security; it just makes my page scraping effort more difficult.

My question is: is there a way to run the JavaScript using some built-in classes to get the final rendered page, so that the video address can be obtained? I don’t want to write my own JavaScript parser; that would be crazy. Besides, I don’t know JavaScript very well, so I’d rather leave it to some existing code and then make use of its output. Is this possible, and if not, what strategy can I use?

—Graham
