Welcome to RFS Data

Overview

This repository contains the scripts and tools used to extract, manipulate, clean, and store data from the technologies RFS uses, and to facilitate our various workflows.

Project layout

docs/                               # Documentation Files
main/           
    query.py                        # Utility script for automating common BigQuery edits

    blocks/
        exports.py                  # Functions supporting the extraction of different export types from Blocks
        processing.py               # Functions used to process and clean data extracted from Blocks
        router.py                   # DEPRECATED | Script used to route data from BigQuery to Google Sheets
        scraper.py                  # Launchable script to extract data from all active Blocks projects
        util.py                     # Non-specific utility functions

    evergiving/                     # Scripts prepared for integration with the Evergiving platform,
                                    # largely undeveloped

    gas/                            
        pr_data_staff.js            # Google Apps Script used to add datastaff email to all protected ranges

    paycom/
        common.py                   # Functions used to interact with the Paycom platform
        processing.py               # Functions used for the processing of data extracted from Paycom
        standard.py                 # Core script for scraping and ingesting Paycom hours, scheduling,
                                    # and staff
        ...                         # Other configuration files, tokens, etc.

    reach/
        no-bq/
            reach_routing.py        # DEPRECATED | Script used to route BigQuery data from Reach into 
                                    # Google Sheets
            reach_scraper.py        # DEPRECATED | Script used for the scraping and importing of data 
                                    # from Reach exports into Google Sheets.

    rfslib/                         # Core library containing many helper functions used in platform-specific
                                    # scripts
        api_assist.py               # Utility functions related to interacting with Google APIs
        etc.py                      # Utility functions related to general data manipulation and other 
                                    # non-specific functions
        net_assist.py               # Utility functions related to Selenium and network data
        projclass.py                # Classes used to bundle different kinds of project data

    tools/
        redactor/
            redactor.py             # Prototype auto-redaction script for VR projects
        bat/                        # Batch files used for scheduling scripts

    van/
        chrome_login.py             # Utility script for logging into specific Selenium profiles
        instance.py                 # Launchable script to scrape specific VAN instances
        merge_all.py                # Launchable script to merge staging tables for all active projects 
        merge_edits.py              # Launchable script to merge QC department changes into BigQuery
        run.py                      # Launchable script for most VAN functionality
        turf_all.py                 # Launchable script to extract turf data from all active projects
        .conf/
            mode_def.yaml           # Library of different modes for main.van.run
            proj_conf.yaml          # Project-level configuration file for VAN scripting
            standard_qc.yaml        # General set of QC standards, overridable in proj_conf.yaml
        lib/
            etc.py                  # Utility functions related to file manipulation and other
                                    # non-specific functions
            repairs.py              # Utility functions related to sanitizing data extracted from VAN
            scraping.py             # Utility functions related to the scraping of VAN data
        modules/
            merge.py                # Functions relating to the merging of data into primary
                                    # BigQuery tables
            processing.py           # Core script for the processing of door-level VAN data
                                    # into list-level data with associated metrics
            turf.py                 # Core script for the extraction and processing of VAN turf
            walk.py                 # Core script for the extraction and uploading of door-level VAN data
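
The layout above notes that `standard_qc.yaml` holds general QC standards that `proj_conf.yaml` can override. The sketch below shows one way that layering could work, assuming both files parse into nested dictionaries. The keys (`min_doors`, `shift`, `max_hours`, `min_hours`) and the `merge_overrides` helper are hypothetical and are not taken from the repository.

```python
from copy import deepcopy


def merge_overrides(defaults: dict, overrides: dict) -> dict:
    """Return a copy of `defaults` with `overrides` applied recursively."""
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge them key by key.
            merged[key] = merge_overrides(merged[key], value)
        else:
            # Scalars, lists, and new keys replace the default outright.
            merged[key] = deepcopy(value)
    return merged


# Hypothetical stand-ins for the parsed standard_qc.yaml and for one
# project's QC entry in proj_conf.yaml.
standard_qc = {"min_doors": 50, "shift": {"max_hours": 8, "min_hours": 4}}
project_qc = {"shift": {"max_hours": 6}}

effective_qc = merge_overrides(standard_qc, project_qc)
print(effective_qc)
```

With a recursive merge, a project only needs to list the values it changes. Every other standard falls through from the general set, and the defaults themselves are left unmodified.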