License: Apache-2.0 (LICENSE) and Unlicense (LICENSE.txt)

gusty/ScrapeM

Latest commit: 5978ada · Oct 10, 2018 (17 commits)

ScrapeM

A monadic web scraping library

This library makes web scraping easier by automatically maintaining state across requests: it handles cookies, form submission and HTTP headers.

One function to scrap'em all

This is essentially a single-function library that integrates several existing libraries and presents several ways to approach web scraping by using different monads.

All other common functions used here come from other libraries such as FSharp.Data, Http.fs and F#+.

Scrapes the web with category

It's possible to create stateful LINQ-style queries that simulate basic user interaction, including form submission, by using different flavours of State monads. Sequence expressions are also available to integrate data extracted from multiple web pages in the same query.
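To make the idea of a stateful query more concrete, here is a minimal, self-contained sketch of how a State monad can thread cookies from one request to the next. It does not use ScrapeM's actual API: `State`, `StateBuilder`, `run` and `fakeRequest` are hypothetical names invented for this illustration, and `fakeRequest` returns canned responses instead of touching the network.

```fsharp
// Hypothetical sketch: none of these names come from ScrapeM itself.
type Cookies = Map<string, string>

// A stateful computation: given the current state, produce a result and the next state.
type State<'s, 'a> = State of ('s -> 'a * 's)

// Run a stateful computation from an initial state.
let run (State f) s = f s

// Minimal computation-expression builder for State.
type StateBuilder() =
    member __.Return x = State (fun s -> (x, s))
    member __.Bind (m, k) =
        State (fun s ->
            let (a, s') = run m s
            run (k a) s')

let state = StateBuilder()

// Stand-in for an HTTP request (assumption: no real network access).
// It reads the current cookies and stores any cookie the fake server sets.
let fakeRequest (url: string) : State<Cookies, string> =
    State (fun cookies ->
        match url with
        | "/login" ->
            ("logged in", Map.add "session" "abc123" cookies)
        | "/profile" ->
            let body =
                cookies
                |> Map.tryFind "session"
                |> Option.map (sprintf "profile for session %s")
                |> Option.defaultValue "access denied"
            (body, cookies)
        | _ ->
            ("not found", cookies))

// The query never mentions cookies; Bind passes them along behind the scenes.
let session =
    state {
        let! login = fakeRequest "/login"
        let! profile = fakeRequest "/profile"
        return (login, profile)
    }

let (pages, finalCookies) = run session Map.empty
```

Because the second request runs with the cookies left behind by the first, `/profile` sees the session cookie set by `/login`. Running only `fakeRequest "/profile"` from an empty state would yield "access denied" instead.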

Getting started

Important: At the moment this library is in a 'Prototype' stage

Recommended: Visual Studio 2017 to avoid slow compile time of generic code

To try the examples, run:

> build.cmd    // on Windows
$ ./build.sh   // on Unix

Now you can try the sample files: