Harvest data from the web with R

In-Person

About the winter school

As a new initiative, KUB Datalab has organised a winter school. The aim is to give participants the opportunity to engage intensively with R for an entire week. The week consists of separate courses, so you can put together a programme that suits your needs and level of experience. Please note that you must register for each course individually.

In short, it is up to you whether you wish to register for one or more courses.

About the course

Websites contain a lot of text and data. In this course you will learn how to harvest the parts of a webpage that you are interested in and want to work with. You will learn how to inspect a website, identify the parts you want to harvest, whether numbers or text, and use R to transform them into a workable format.
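To give a feel for what identifying and harvesting part of a page can look like, here is a minimal sketch. It assumes the rvest package, which this description does not name, and it uses a small hypothetical page written inline so that it runs without an internet connection. The CSS selector and the headlines are made up for illustration.

```r
library(rvest)

# Hypothetical page content, written inline so the sketch runs offline.
page <- minimal_html('
  <h1>News</h1>
  <div class="article"><h2>First headline</h2><p>Some text.</p></div>
  <div class="article"><h2>Second headline</h2><p>More text.</p></div>
')

# Pick out the parts of interest with a CSS selector,
# then extract their text as a character vector.
headlines <- page %>%
  html_elements("div.article h2") %>%
  html_text2()

headlines
```

For a real website, `read_html()` with the page's address would take the place of `minimal_html()`, and the selector would come from inspecting the page in your browser.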

This is called web scraping. One of its advantages is that we can harvest data and text that are spread over many pages instead of having to copy-paste each page manually. In addition, we can ensure that the result is in a format that allows us to work with the page's content. In this course we will harvest both numbers from tables about the demographics of University of Copenhagen students and text about a political and societal issue, both past and present.
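As a rough illustration of harvesting tables spread over several pages, the sketch below again assumes the rvest package. The two pages and their numbers are invented placeholders, not real University of Copenhagen figures, and are written inline so the code runs offline.

```r
library(rvest)

# Two hypothetical pages with made-up numbers, written inline.
pages <- list(
  minimal_html('<table>
    <tr><th>faculty</th><th>students</th></tr>
    <tr><td>Science</td><td>100</td></tr>
  </table>'),
  minimal_html('<table>
    <tr><th>faculty</th><th>students</th></tr>
    <tr><td>Law</td><td>50</td></tr>
  </table>')
)

# Turn the first table on each page into a data frame.
tables <- lapply(pages, function(p) html_table(html_element(p, "table")))

# Stack the per-page tables into a single data frame.
students <- do.call(rbind, tables)

students
```

The same loop works with a vector of real page addresses passed to `read_html()`, which is what replaces copy-pasting each page by hand.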

 

We assume that you have some experience working with R and know the tidyverse. The level required is what you would learn in our introductory course "R for absolute beginners".

Please have R and RStudio installed before the course. You can install R for Windows here and for Mac here. You can install RStudio for Windows and Mac here.

The course will be held in English if any participant requests it. If everyone speaks Danish, it can be held in Danish.

Date:
30/01/2026
Time:
12:00 - 14:30
Time Zone:
Central European Time
Location:
Library Lighthouse, zones 1 and 2
Campus:
KUB North Campus, Nørre Allé 49, 2200 København N
Categories:
Cleaning, English, Harvesting, R, Datalab

Registration is required. There are 39 seats available.

Event Organizer

Søren Willer Hansen