Jupyter Hub for Ethical Web Scraping using Python packaged by Data Science Dojo

Data Science Dojo

Our Jupyter instance provides an easy-to-use environment for web scraping.

Data Science Dojo delivers data science education, consulting, and technical services to harvest the power of data.

Trademarks: This software listing is packaged by Data Science Dojo. The respective trademarks mentioned in the offering are owned by the respective companies, and use of them does not imply any affiliation or endorsement.

About the offer:

Jupyter Hub for Ethical Web Scraping using Python gives you an effortless coding environment in the cloud with pre-installed Python web scraping libraries, reducing the burden of installation and maintenance. Much of the data available on the internet is not offered as an open, downloadable dataset, so web scraping is often the most practical way to collect it. Through this offer, a user can collect data from various sources and then analyze it to gain insights. The heavy computations required for these applications are not performed on the user’s local machine. Instead, they run in the Azure cloud, which can improve responsiveness and processing speed.

Who benefits from this offer:

The following can benefit from our instance:

  • Teams of developers
  • Data scientists
  • Machine learning engineers
  • Scientific research groups
  • SEO teams

What is included in this offer:

  • Pre-installed web scraping libraries for Python.
  • Ready-to-use notebooks with example code that guides users through web scraping tasks.
  • Work with multiple notebooks at the same time.
  • Kernel-backed documents enable code in any text file (Markdown, Python, etc.) to be run interactively in any Jupyter kernel.
  • Code consoles to run code interactively, with full support for rich output.

Technical Specifications:

  • Recommended memory: 8 GB RAM
  • Recommended vCPU: 4 vCPUs
  • Operating System: Ubuntu 18.04

Our offer includes a repository from the following source:

  • GitHub repository for the book Web Scraping with Python by Ryan Mitchell.

The following authoring tools are supported in this offer:

  • JupyterHub
  • JupyterLab
  • Terminal

Our instance supports the following Python libraries:

  • pandas
  • numpy
  • scikit-learn
  • beautifulsoup4
  • lxml
  • MechanicalSoup
  • requests
  • scrapy
  • selenium
  • urllib3
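
Before fetching pages with any of the libraries above, an ethical scraper usually checks the target site's robots.txt rules. The sketch below is one possible way to do that, using only urllib.robotparser from the Python standard library. The robots.txt content, the user-agent name, and the example.com URLs are made up for illustration and are not part of this offer.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content. A real scraper would download it from
# the target site, for example with parser.set_url(...) and parser.read().
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 5
"""

USER_AGENT = "example-research-bot"  # assumed name, chosen for this sketch


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt rules from a string."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed_urls(parser, urls, user_agent=USER_AGENT):
    """Return only the URLs the rules permit this user agent to fetch."""
    return [url for url in urls if parser.can_fetch(user_agent, url)]


parser = build_parser(ROBOTS_TXT)
candidates = [
    "https://example.com/articles/1",
    "https://example.com/private/report",
]
to_fetch = allowed_urls(parser, candidates)

# Seconds to wait between requests, or None if the site does not specify one.
delay = parser.crawl_delay(USER_AGENT)
print(to_fetch, delay)
```

The URLs left in to_fetch could then be requested with a library such as requests, pausing for the crawl delay between calls.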

JupyterHub listens on port 8000 by default. You can access the web interface at http://yourip:8000 using the following credentials:

  • username: guest
  • password: guest@123

The Jupyter Trademark is registered with the U.S. Patent & Trademark Office.