[document, code, workflow, reference, comment, documentation]


Overview

Documenting your project’s workflow, not only for others, but also for your future self (i.e., if you plan to continue working on the project after a while) is absolutely crucial to the long-term success of you as a researcher or analyst.

Typically, you would like to

Main Project Documentation

You should place a main project documentation in the root directory of your project (/my_project), and call it readme.txt. Keep the document brief and simple, but include at least the following information:

  • Project name
  • Details about the project
    • Project description (“what does the project do?")
    • Authors and email addresses
    • Date of last update
  • Build instructions
    • Dependencies (“what software is needed to replicate the project?")
    • Explaining the directory structure (“where to find what?")
    • How to run/build the project

Here is an example documentation you can use as a template:

===================================================================
PROJECT NAME
===================================================================

DESCRIPTION:
------------
Put project description here. You can use multiple lines, but keep
the width of the text limited to the
header.

AUTHORS:
--------
Hannes Datta, h.datta@tilburguniversity.edu (maintainer)

LAST UPDATED:
-------------
29 NOVEMBER 2019


BUILD INSTRUCTIONS
==================

1) Dependencies

Please follow the installation guide on
https://www.tilburgsciencehub.com/ for

- R and RStudio (3.6.x)
  Install the following R packages:

	packages <- c("data.table", "ggplot2")

	install.packages(packages)

- Gnu Make
  Put GnuMake and R to path so that you can run it
  from anywhere on your system. See http://www.tilburgsciencehub.com/

- Obtain raw data files and put them into /data/

2) Directory structure

The project pipeline consists of the following stages:

/src/collect                Code required to collect/download raw data
/src/data-preparation       Data preparation
/src/analysis               Data analysis
/src/paper                  Stores literature reference, paper, and slides

Each directory has a makefile, with running descriptions
for each stage of the pipeline.

For each pipeline stage, the /gen directory contains
files generated on the basis of the /data and
source code stored in /src.

Each directory contains subdirectories,
	/input (for input files)
	/output (for final output files)
	/temp (for any temporary files)
	/audit (for any auditing files)

3) How to run the project

Navigate to the project's root directory, open a terminal,
and run

> make

Documentation for each stage of the pipeline

Ideally, a makefile lists all the necessary steps to run your pipeline. If you do not have a makefile yet, include a readme.txt instead.

Here is a readme.txt template to start from:

OVERVIEW
====================================================
- Provide a two or three sentence overview of the directory.

DESCRIPTION
==========================================================
- If you are using a makefile (strongly recommended!),
  please refer to the content of that file for running instructions.

- If you do not make use of a makefile, please briefly describe
  the contents of the subdirectory and its files.
  Also provide instructions how to run the files, and in which order.