
Ask an AI to help improve your REMARK

Timing: Assigned in Class 12, due before Class 13.

Prerequisites: You have a REMARK-in-progress for your course paper — the repository structure from Paper topic choice and REMARK setup plus whatever you have built since. You have completed Ask an AI to assess REMARK compliance (baseline tier). This assignment is a focused follow-up that targets substantive improvements rather than checklist compliance.


Goal

Use a strong AI model (Claude Opus 4.7 recommended) as a critical reviewer of your REMARK repository. Have it read your code, configuration, and documentation against the standards at econ-ark/REMARK, and produce a prioritized list of substantive (not cosmetic) improvements: changes to reproducibility, model fidelity, documentation completeness, or code quality.

Then pick the top 3–5 and actually make them, on a branch that you submit as a PR against your own REMARK repo.


Why a strong model

Critical review of a codebase — finding places where reproduce.sh will silently fail, spotting a subtle departure from the paper’s model, identifying the parts of the documentation a new reader cannot follow — is exactly the kind of work that rewards a strong model. Use Claude Opus 4.7 in Cursor or Claude Code, with the agent in read-and-analyze mode (not auto-edit mode) so you see every suggestion before it lands. If Opus 4.7 is unavailable, Sonnet 4.6 is an acceptable substitute; avoid Haiku for this task.


Example prompt (adapt to your REMARK)

Please review my REMARK-in-progress at

  <YOUR REPO URL>

against the baseline-REMARK standards at

  https://github.com/econ-ark/REMARK/blob/main/STANDARD.md
  https://github.com/econ-ark/REMARK/blob/main/How-To-Make-A-REMARK.md
  https://github.com/econ-ark/REMARK/blob/main/README.md

Ignore the published-tier requirements (Zenodo DOI etc.). Focus on the standard / baseline tier.

I want a prioritized list of substantive improvements — not cosmetic ones — categorized by:

  1. Reproducibility. Can reproduce.sh run end-to-end from a clean clone? Where is it fragile, silently failing, or reliant on state that is not captured by binder/environment.yml?

  2. Model fidelity. Does the code compute what the paper actually specifies? Are there silent approximations, simplifications, or departures (e.g., a different utility kernel, a different shock process, a different timing convention) that should be either (a) justified and documented, or (b) fixed?

  3. Documentation completeness. Is CITATION.cff present and correct? Is there a top-level REMARK.md (or equivalent) explaining what the REMARK does, which paper result(s) it reproduces, and how a reader should start?

  4. Code quality. Where would an ambitious follow-up student get stuck — magic numbers, undocumented data dependencies, tight coupling to your local filesystem, environment assumptions?

  5. Anything else a thoughtful reviewer would flag before you submit a PR to the REMARK catalog.

Output format: a numbered list of 5–10 proposed improvements. For each give (a) the problem in one sentence, (b) the proposed change, (c) where it goes (file, function, script, line range). Omit cosmetic or formatting items.
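
A few of the questions in the prompt above can be spot-checked mechanically before you read the model's answer. The following is a minimal sketch, not part of the REMARK standard: it only checks that the files named in the prompt exist in a local clone. The function name `missing_files` and the `EXPECTED_FILES` list are illustrative choices; adjust the list if your REMARK uses an equivalent file (the prompt allows "REMARK.md (or equivalent)").

```python
from pathlib import Path

# File names taken from the prompt text; this list is an assumption about
# your layout, not an authoritative statement of the REMARK standard.
EXPECTED_FILES = (
    "reproduce.sh",
    "CITATION.cff",
    "binder/environment.yml",
    "REMARK.md",
)


def missing_files(repo_root, expected=EXPECTED_FILES):
    """Return the expected relative paths that are not files under repo_root."""
    root = Path(repo_root)
    return [name for name in expected if not (root / name).is_file()]
```

Calling `missing_files(".")` from the top of your clone returns an empty list when all four files are present. Presence says nothing about correctness, so this complements the model's review of items 1 and 3 and does not replace it.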


What to do

  1. Adapt the prompt for your REMARK — fill in your repo URL; if your paper has a known approximation or simplification you want the model to flag (or to ignore), say so explicitly in part 2.
  2. Run the prompt in one Cursor / Claude Code thread with Claude Opus 4.7. Record the full model response.
  3. Critically review. Mark each suggestion accept / edit / reject. Suggestions about model fidelity deserve the most scrutiny — the model may miss paper-specific nuance or invent concerns.
  4. Implement 3–5 accepted improvements. Open a PR on your own REMARK repo titled Substance improvements from Claude Opus 4.7 review.
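
When you review reproducibility suggestions in step 3, or before you open the PR in step 4, it can help to see for yourself whether reproduce.sh runs from a clean clone. The sketch below is one possible way to do that, assuming `git` and `bash` are on your PATH. The helper names (`clone_fresh`, `run_in_dir`, `check_clean_clone`) are hypothetical. The script runs in whatever Python environment is currently active, so this catches reliance on untracked local files but not gaps in binder/environment.yml.

```python
import subprocess
import tempfile
from pathlib import Path


def clone_fresh(repo_url, dest):
    """Clone repo_url into dest (which must not exist yet); raise if git fails."""
    subprocess.run(["git", "clone", repo_url, str(dest)], check=True)


def run_in_dir(workdir, command=("bash", "reproduce.sh"), timeout=None):
    """Run command inside workdir; return (exit_code, last 20 output lines)."""
    result = subprocess.run(
        list(command),
        cwd=str(workdir),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so errors appear in context
        text=True,
        timeout=timeout,
    )
    return result.returncode, result.stdout.splitlines()[-20:]


def check_clean_clone(repo_url):
    """Clone into a throwaway directory and run reproduce.sh there."""
    with tempfile.TemporaryDirectory() as tmp:
        clone_dir = Path(tmp) / "clone"
        clone_fresh(repo_url, clone_dir)
        return run_in_dir(clone_dir)
```

A nonzero exit code is a clear failure. A zero exit code is necessary but not sufficient: a script can exit 0 while an intermediate step failed silently, so read the tail of the output as well.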

Deliverable

Submit the PR URL and include in the PR description:

  1. The exact prompt you used (adapted from the template).
  2. The full model response (or a link — a gist is fine).
  3. Your accept / edit / reject judgment on each suggestion, with one-line reasoning.
  4. No separate before / after diff: the PR itself serves as one.
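
One way to present item 3 is a small table in the PR description. The sketch below renders such a table as Markdown, assuming your repository is hosted on GitHub, where PR descriptions render Markdown. The `Judgment` class and `render_table` function are illustrative helpers, not a required format, and any entries you pass in should come from your own review.

```python
from dataclasses import dataclass

VALID_VERDICTS = ("accept", "edit", "reject")


@dataclass
class Judgment:
    number: int     # position of the suggestion in the model's list
    summary: str    # the problem, in a few words
    verdict: str    # one of VALID_VERDICTS
    reasoning: str  # your one-line reasoning


def render_table(judgments):
    """Return a Markdown table with one row per suggestion."""
    lines = [
        "| # | Suggestion | Verdict | Reasoning |",
        "|---|------------|---------|-----------|",
    ]
    for j in judgments:
        if j.verdict not in VALID_VERDICTS:
            raise ValueError(
                f"suggestion {j.number}: unknown verdict {j.verdict!r}"
            )
        lines.append(
            f"| {j.number} | {j.summary} | {j.verdict} | {j.reasoning} |"
        )
    return "\n".join(lines)
```

Printing `render_table([...])` and pasting the result into the PR description gives a reviewer the verdicts at a glance. The validation step guards against a row slipping through without one of the three allowed judgments.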
