
A crawler that collects player information from Basketball Reference.


hanjuTsai/basketballReferenceCrawler

Latest commit: d0a0d80 · Dec 6, 2020 · 30 commits

Project Overview

This project aims to collect biographical information and award records for 33,892 players, providing statistical data for the research group at NTNU (National Taiwan Normal University).

Static Page

Parse the plain HTML to fetch the required data. For example, check whether the player was inducted into the Hall of Fame or named an All-Star, as shown in the blue box on the player page.
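The check described above can be sketched as follows. This is only an illustration, not the project's actual code: the project lists bs4, but this sketch swaps in the standard library's `html.parser` so that it runs without extra packages. The element id `bling` for the blue box, the sample markup, and the names `HonorsParser` and `parse_honors` are all assumptions made for the example.

```python
from html.parser import HTMLParser


class HonorsParser(HTMLParser):
    """Collect the text of each <li> inside the element whose id is box_id."""

    def __init__(self, box_id="bling"):  # "bling" is an assumed id for the blue box
        super().__init__()
        self.box_id = box_id
        self.box_tag = None   # tag name of the box element once it is found
        self.depth = 0        # nesting depth of box_tag (0 means outside the box)
        self.in_item = False  # True while inside an <li> of the box
        self.honors = []

    def handle_starttag(self, tag, attrs):
        if self.depth:
            if tag == self.box_tag:
                self.depth += 1
            elif tag == "li":
                self.in_item = True
                self.honors.append("")
        elif dict(attrs).get("id") == self.box_id:
            self.box_tag = tag
            self.depth = 1

    def handle_endtag(self, tag):
        if not self.depth:
            return
        if tag == self.box_tag:
            self.depth -= 1
            if self.depth == 0:
                self.in_item = False
        elif tag == "li":
            self.in_item = False

    def handle_data(self, data):
        if self.in_item:
            self.honors[-1] += data


def parse_honors(html):
    """Return the honors listed in the box plus two convenience flags."""
    parser = HonorsParser()
    parser.feed(html)
    honors = [h.strip() for h in parser.honors if h.strip()]
    lowered = [h.lower().replace("-", " ") for h in honors]
    return {
        "hall_of_fame": any("hall of fame" in h for h in lowered),
        "all_star": any("all star" in h for h in lowered),
        "honors": honors,
    }


# Hypothetical markup standing in for a downloaded player page.
SAMPLE_HTML = """
<div id="info">
  <ul id="bling">
    <li><a>Hall of Fame</a></li>
    <li><a>14x All Star</a></li>
  </ul>
</div>
"""

print(parse_honors(SAMPLE_HTML))
```

In a real run the HTML string would come from an HTTP response (for example `requests.get(url).text`) rather than from an inline sample.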

Dynamic Page

Some personal information is not included in the HTML returned by a plain request to the URL. Chromedriver is therefore used to mimic a human user: it clicks a button to expand the hidden section and then reads the expanded page.
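A minimal sketch of this click-then-read flow is shown below. It is an illustration under stated assumptions, not the project's implementation: the function names, the URL, and the CSS selector of the button are hypothetical, and a real page may need an explicit wait after the click before the expanded content appears.

```python
def fetch_expanded_html(driver, url, button_css):
    """Load url, click the element matching button_css, return the page HTML.

    driver is any Selenium WebDriver instance (or an object with the same
    get / find_element / page_source interface).
    """
    driver.get(url)
    # "css selector" is the string value of selenium's By.CSS_SELECTOR constant.
    button = driver.find_element("css selector", button_css)
    button.click()
    return driver.page_source


def crawl_player(url, button_css):
    """Open Chrome through Chromedriver, expand the page, and always quit."""
    # Imported here so that fetch_expanded_html has no hard selenium dependency.
    from selenium import webdriver

    driver = webdriver.Chrome()  # assumes chromedriver is available on PATH
    try:
        return fetch_expanded_html(driver, url, button_css)
    finally:
        driver.quit()
```

Passing the driver in as a parameter keeps the clicking logic separate from browser start-up, so the same function can be exercised with a stand-in driver object when no browser is available.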

Collect the following information from plain HTML

  1. Whether the player was inducted into the Hall of Fame
  2. The seasons in which the player was named to All-Star Game rosters
  3. The player's personal information, such as nationality, weight, height, and educational background
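One way to hold the three kinds of information listed above is a small record type, sketched here. The field names, the `PlayerRecord` class, the `parse_height_weight` helper, and the sample text format `"6-6, 216lb (198cm, 98kg)"` are assumptions made for illustration; the project's real output layout is not documented here.

```python
import re
from dataclasses import dataclass, field, asdict
from typing import List, Optional, Tuple


@dataclass
class PlayerRecord:
    """One row of collected data for a single player (hypothetical layout)."""
    name: str
    hall_of_fame: bool = False
    all_star_seasons: List[str] = field(default_factory=list)
    nation: Optional[str] = None
    height_cm: Optional[int] = None
    weight_kg: Optional[int] = None
    education: Optional[str] = None


# Matches the metric part of a string such as "6-6, 216lb (198cm, 98kg)".
_METRIC = re.compile(r"\((\d+)\s*cm,\s*(\d+)\s*kg\)")


def parse_height_weight(text: str) -> Tuple[Optional[int], Optional[int]]:
    """Return (height_cm, weight_kg), or (None, None) if the text does not match."""
    match = _METRIC.search(text)
    if match is None:
        return None, None
    return int(match.group(1)), int(match.group(2))


# Made-up values, used only to show how a record is filled in.
height, weight = parse_height_weight("6-6, 216lb (198cm, 98kg)")
record = PlayerRecord(
    name="Example Player",
    hall_of_fame=True,
    all_star_seasons=["1990", "1991"],
    height_cm=height,
    weight_kg=weight,
)
print(asdict(record))
```

Fields that cannot be found on a page stay `None`, which keeps missing data distinguishable from real values in later statistical work.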

Required Packages

  1. Chromedriver version: 2.43.600229

Python packages

  1. selenium 3.14.1
  2. requests 2.19.1
  3. bs4 0.0.1
  4. xlrd 1.1.0
