1 Iterative Study Design

At this early stage of the project, the goal is to put together a solid research plan and get your whole team on the same page. This is an iterative process: you’ll start by organizing your thinking on study aims and design and doing some initial data exploration. Then you’ll check in with your mentors and collaborators, revise your study design and methods, and check back in with your team to see if further changes are needed. While you do this work, you can plan ahead and start assembling the information you’ll eventually need to write a manuscript for peer review.

1.1 Keep yourself organized

It pays to get organized at this early stage of the project. This will save you time as you get to the later stages of analysis and writing.

1.1.1 Tool: Project Log

You need a system to organize project-specific notes and tasks, particularly when you’re conducting multiple projects at the same time. I use a basic Markdown file as my Project Log, which is essentially a digital lab book. You could also use a Word doc, paper journal, or a task management app. I organize the Project Log into two sections: Meta and the Daily Log.

The Meta section includes:

  • Project Information, including a full list of authors’ names, affiliations, and email addresses. You’ll need this when you fill out the forms to submit manuscripts to journals (and you can expect to have to submit to multiple journals). You can also keep a list of conferences where you presented your project; sometimes journals ask for this information, too.
  • Journal Information, including a list of journals you’d like to submit to in order of preference.
  • Data, including each dataset’s name, a short description of where it came from, and any relevant citations. Down the road, a colleague may ask you where you got a particular dataset so they can use it in their own projects, or a journal may ask you to write a data availability statement describing where readers can find the data you used. Capture this information now so you don’t need to dig it up later.
  • Tasks, or a to-do list. This is a place to list project-specific work that you know you’ll need to get done in the short or long term. Also, before you close out a work session, you can leave a short note here to remind yourself where to pick up next time.

In the Daily Log section, you can take running notes on decisions you’re making throughout the project. You don’t need to write down everything; focus on decision points that you might want to refer back to later. For example, you could take notes from a team meeting on which dataset you decided to use and where you downloaded it from, or which journal your team ended up submitting to first and why you chose it.

The Project Log is a tool for you; there’s no need to share it. Think ahead to where you’ll be six months from now, putting the finishing touches on your manuscript before you submit it for peer review. Write in a way that you think will make sense to you at that stage, so that you can quickly find the information you need.

Download a daily log template.
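If you’d rather generate the file yourself, here is a minimal sketch in Python of one way to lay out a Project Log in Markdown. The function name, heading levels, and prompt text are my own illustrative choices, not a required format; adapt them to whatever structure works for you.

```python
from datetime import date


def project_log_skeleton(title: str, today: date) -> str:
    """Return a Markdown skeleton with a Meta section and a Daily Log section."""
    lines = [
        f"# Project Log: {title}",
        "",
        "## Meta",
        "",
        "### Project Information",
        "- Authors (full name, affiliation, email):",
        "- Conferences where this project was presented:",
        "",
        "### Journal Information",
        "- Target journals, in order of preference:",
        "",
        "### Data",
        "- Dataset name | where it came from | citation:",
        "",
        "### Tasks",
        "- [ ] Where to pick up in the next work session:",
        "",
        "## Daily Log",
        "",
        # One dated heading per work session; add a new one each day.
        f"### {today.isoformat()}",
        "- Decision and reasoning:",
        "",
    ]
    return "\n".join(lines)


# "Example study" is a placeholder title (hypothetical).
print(project_log_skeleton("Example study", date.today()))
```

You could save the printed text as a `.md` file and add a new dated heading under the Daily Log each time you sit down to work.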

1.2 Design your study

Once you’ve framed a research question (or set of questions) that you’d like to investigate, it’s time to outline your initial thoughts. At this stage, you should be prepared to write up the context of your research question(s), how your study builds on prior research, and what data and methods you plan to apply to conduct your study.

Note that how to frame a good research question and how to design a sound study are outside the scope of this Handbook.

You don’t need to prove you’re a good scientific writer at this stage. Save that for later! Your aim should be to efficiently organize your thoughts, get feedback, and iterate, before you make the time investment in polishing your writing.

1.2.1 Tool: Write a high level outline

The high level outline can help you organize your thoughts and receive feedback. This tool comes from Luby and Southern (2023), The Pathway to Publishing: A Guide to Quantitative Writing in the Health Sciences, and is a helpful organizing document for both the writer and anyone reviewing the document. In this document, the writer briefly summarizes the study’s context, the gap it addresses, and its objectives. The writer also concisely describes the study design, data, and methods, outlines tables and figures, and anticipates strengths, limitations, and prospective findings and conclusions.

As described by Luby and Southern:

“The role of the high level outline is to sketch out the major components of the manuscript that will support the data analysis included in the framing document [another writing tool suggested by Luby and Southern]. This is an outline, that should be no longer than 1500 words (excluding the tables, figures and references). Full sentences are not necessary.

“Keeping the document short helps the author focus on the key elements of the manuscript, and provides early high level input. Because a short document takes less time for authors to produce and less time for co-authors to review, it generates prompt feedback on key ideas, and so supports a faster path to publication. Using this approach prevents authors investing weeks or months developing full draft manuscripts, that are off target with pages and pages of prose that need to be discarded.”

I recommend reading Luby and Southern (2023) to learn more about their approach to scientific writing and to see templates of the high level outline and other writing tools.

1.2.2 Tool: Data memos for preliminary analyses

A data memo is an RMarkdown (for R users) file that describes something about your data. In the design stage, this is mainly for getting to know your datasets in a way you can share with your colleagues. With RMarkdown, you can embed ‘chunks’ of R code into a text document that’s easy to export as a PDF or HTML file, allowing you to analyze data and share summary statistics, plots, and tables without your colleagues needing to work with raw R code (and without you needing to generate a second document). For Python users, Jupyter Notebook has some similar capabilities.

Here’s an example data memo. When I wrote this data memo, I was exploring the idea of examining the associations between historical redlining and the siting of oil and gas wells in California. As you can see in the memo, I was able to determine that there was enough overlap between wells and redlined neighborhoods in Los Angeles to justify further work on this research question. Once I generated this data memo, it was easy to share it with my collaborators. This first preliminary analysis evolved into a peer-reviewed publication in the Journal of Exposure Science and Environmental Epidemiology, where we found that oil and gas wells were concentrated in historically redlined neighborhoods in cities across the United States.
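For Python users, here is a minimal sketch of the kind of summary a very small data memo might contain, written so it could be pasted into a Jupyter Notebook cell. The function names (`summarize_column`, `render_memo`) and the toy records are hypothetical and invented for illustration; they are not drawn from the memo or the study described above. It uses only the standard library, though in practice you might prefer a data analysis package.

```python
import statistics


def summarize_column(rows: list[dict], column: str) -> dict:
    """Compute simple summary statistics for one numeric column."""
    # Treat None (or an absent key) as a missing value.
    values = [row[column] for row in rows if row.get(column) is not None]
    return {
        "n": len(values),
        "missing": len(rows) - len(values),
        "min": min(values),
        "median": statistics.median(values),
        "max": max(values),
    }


def render_memo(title: str, rows: list[dict], columns: list[str]) -> str:
    """Render a short Markdown memo with one summary-table row per column."""
    lines = [
        f"# Data memo: {title}",
        "",
        "| variable | n | missing | min | median | max |",
        "|---|---|---|---|---|---|",
    ]
    for column in columns:
        s = summarize_column(rows, column)
        lines.append(
            f"| {column} | {s['n']} | {s['missing']} | "
            f"{s['min']} | {s['median']} | {s['max']} |"
        )
    return "\n".join(lines)


# Toy records invented for illustration only; not real data.
toy_rows = [
    {"unit": "A", "count": 3},
    {"unit": "B", "count": 0},
    {"unit": "C", "count": None},
    {"unit": "D", "count": 7},
]

print(render_memo("toy example", toy_rows, ["count"]))
```

Reporting the number of missing values alongside the summary statistics is a deliberate choice here: at the design stage, gaps in a dataset are often as informative to your collaborators as the values themselves.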

1.3 Ask for feedback

Different mentors and research groups have different approaches to providing feedback. Ask your mentor about their approach and how they like to provide feedback. Ideally, you have a network of formal mentors ready to provide feedback, including your adviser and other members of your lab group. Community partners can also be part of this network.

1.4 Iterate

Revise your high level outline. Do more exploratory analyses. Talk to your colleagues. Read similar papers in the literature, paying attention to their framing, methods, and findings. Are there methods you can adapt or build on? Papers you should cite? Hypotheses you can follow up on? Ask for more feedback. Finally, once you, your mentors, and your collaborators agree that you have a solid plan, move on to the Primary Research and Writing stage.

1.5 Talking about authorship

It’s good practice to talk about authorship at the early stages of a research project. This can be a little awkward, but that’s okay; it’s an important and normal conversation to have. The principal investigator (PI) or senior author (often the last author listed on a manuscript) may have the authority to make suggestions for who should be authors and in which order. That said, PIs can get busy and lose track of the current status and authorship of their various projects. Ideally, a PI will foster an environment where students and trainees can routinely raise questions about authorship and project status.

A PI may have their own system for tracking their various projects, such as a paper tracker. The paper tracker is a tool to assemble research ideas, denote which members of research teams are working on which projects, and encourage lab members to have conversations and get organized. This kind of tool can promote collaboration and transparency, particularly for large projects with several outputs (papers, presentations, reports, etc.).

1.5.1 Authorship expectations

The expectations for what it takes to be an author on a peer-reviewed study vary between fields. For researchers in the biomedical sciences, the International Committee of Medical Journal Editors (ICMJE) has come up with four recommended authorship criteria:

  1. Substantial contributions to the conception or design of the work; or the acquisition, analysis, or interpretation of data for the work; AND
  2. Drafting the work or reviewing it critically for important intellectual content; AND
  3. Final approval of the version to be published; AND
  4. Agreement to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.

According to ICMJE, all of these criteria must be met for someone to qualify as an author.

It’s also good practice to set authorship expectations early, at the design stage of a project. Come up with a division of labor that everyone agrees to and feels good about. Who will be assembling the data? Writing code? Interpreting results? Writing sections of the manuscript?