Observations from External Projects

CATcher

ARIF KHALID

Project: FreeCodeCamp

FreeCodeCamp is a friendly community where you can learn to code for free.
The organisation houses a number of open source projects such as its online learning platform, mobile app, developer news publication, vscode extension, and more.
All of its open source solutions work towards the noble goal of providing high quality technical education for free.

I have focused my contributions on its curriculum and web application codebase, which allows users to complete incremental tasks and build lightly guided projects, giving them confidence in tackling their own projects and easing them out of tutorial hell.

My Contributions

  1. Update instruction verbiage
    • A good first issue for me, requiring no implementation logic but a lot of effort understanding and finding the part of the large code base to change
    • Reworded the instruction for a particular step so that it implies adding a CSS property rather than changing an existing one
  2. Fix backend code source submission
    • A long-standing issue that required deeper understanding of the redux library that was used heavily to handle errors as well as the submission workflow
    • Added warning against private source code link submissions
    • Handled submission errors due to invalid source code links
    • Added Playwright tests for the new behaviour
  3. Persist editor open tabs
    • A new feature to persist the open tabs of a multi-file course as users move to future steps
    • Simple implementation logic but many hours had to be spent understanding the multifile editor and challenge steps workflow
    • Used the previous state as a basis to initialize the state of the next multifile editor
    • Added Playwright tests for the new behaviour
  4. Editor visual file indication
    • An additional UI above editors to display which file the editor corresponded to
    • UI logic requiring lookup and use of predefined CSS variables
    • Added a cohesive line at the top of editors showing which file it corresponded to
    • Fixed the Playwright tests so that they pass with the new changes
  5. Improve instruction clarity
    • A long-standing issue where the particular challenge rejected an answer utilising string interpolation
    • No implementation logic was needed, and my improved understanding of the codebase helped me find the offending area much more quickly
    • Added improved instruction clarifying that the value should be used directly rather than through string interpolation

My Learning Record

  1. Redux
    • Redux is a JS library for predictable and maintainable global state management. At its core, Redux makes global state easier to manage through an easy-to-understand API and a host of enhancements that work out of the box. I think of it as a wrapper around React context. While everything Redux does could be implemented with React context alone, Redux exposes a more concise API that reduces boilerplate code, and it extends the base functionality with features such as middleware for handling the side effects of accessing the storage. FreeCodeCamp's codebase uses Redux heavily to centralise state management and the persistent storage solution, and Redux actions trigger state changes for almost all interactions in the app. Therefore, to contribute meaningfully to more than just UI updates, as I have, Redux was one of the first things I had to learn.
    • Resource used: Redux official docs
      • Comprehensive official documentation on Redux
      • Used in conjunction with study of the implementation on FreeCodeCamp
  2. Playwright
    • Playwright is an end-to-end testing framework packed full of everything you need to comprehensively test any web app. FreeCodeCamp strongly enforces end-to-end testing, while putting less emphasis on unit testing. In order to contribute more than just UI changes, I had to be able to write tests for new functionality I introduced or bugs I fixed. Thus, I learned not just how to write good tests, but also how to set up and run them using Playwright. With the popularity of Playwright coupled with the importance of testing, this is a critical skill to have in my career as a software engineer.
    • Resource used: Playwright official docs
      • Comprehensive official documentation on Playwright
      • Used in conjunction with examples from the other e2e tests present in FreeCodeCamp
    • Resource used: Making E2E tests resilient and maintainable
      • One of many articles used in finding out how best to write tests
      • Gives a high-level look at how tests should be written, such as making tests dynamic and keeping them concise and limited to what is necessary
  3. Contributing to open source
    • Open source contributions are very daunting at first and it is difficult to know how to even begin. I turned to a number of resources and guides in order to help me towards starting and continuing to contribute to a significant project.
    • Resource used: Up For Grabs
      • Website aggregating open source projects suitable for first timers
    • Resource used: FreeCodeCamp Contributions guide
      • Good general guidelines to contribute to any open source projects
      • Specific advice and guidance on contributing to FreeCodeCamp such as naming conventions, file locations, etc.
    • Resource used: How to contribute to open source
      • Video containing advice towards contributing to any open source project
      • Useful tips on different aspects of open source for a newbie such as patience, reviewing your own PR, opening issues, reviewing PRs, etc.
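
To make the Redux item in this learning record more concrete, here is a minimal hand-rolled sketch of the action, reducer and store flow that Redux is built around. It does not use the Redux library and is not taken from FreeCodeCamp's code; `makeStore`, `SubmissionState`, the action names and the example URL are all hypothetical and exist only for illustration.

```typescript
// Hand-rolled sketch of the action -> reducer -> store flow.
// Nothing here is the real Redux API or FreeCodeCamp code; all names are hypothetical.

type SubmissionState = { sourceUrl: string | null; error: string | null };

type Action =
  | { type: "submission/succeeded"; url: string }
  | { type: "submission/failed"; reason: string };

const initialState: SubmissionState = { sourceUrl: null, error: null };

// A reducer is a pure function: (previous state, action) -> next state.
function submissionReducer(state: SubmissionState, action: Action): SubmissionState {
  switch (action.type) {
    case "submission/succeeded":
      return { sourceUrl: action.url, error: null };
    case "submission/failed":
      return { sourceUrl: state.sourceUrl, error: action.reason };
  }
}

// A tiny store: holds the state, applies the reducer on dispatch, notifies subscribers.
function makeStore<S, A>(reducer: (state: S, action: A) => S, initial: S) {
  let state = initial;
  const listeners: Array<() => void> = [];
  return {
    getState: (): S => state,
    dispatch(action: A): void {
      state = reducer(state, action);
      listeners.forEach((listener) => listener());
    },
    subscribe(listener: () => void): void {
      listeners.push(listener);
    },
  };
}

const store = makeStore(submissionReducer, initialState);
let notifications = 0;
store.subscribe(() => {
  notifications += 1;
});
store.dispatch({ type: "submission/failed", reason: "invalid source code link" });
store.dispatch({ type: "submission/succeeded", url: "https://example.com/repo" });
```

Because every change goes through `dispatch`, there is a single place to observe or intercept state changes, which is roughly the role middleware plays in the real library.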

Other things of note about FreeCodeCamp

  • FreeCodeCamp relies heavily on GitHub labels
    • When you create a PR that edits a certain part of the codebase, a bot automatically adds relevant labels to that PR, such as scope: curriculum
    • When maintainers review a PR, they use labels to signify its status in the review pipeline
      • status: waiting review is added to signal to other maintainers that the PR is ready for another maintainer to review, i.e. all tests pass and the first reviewer either approves or is not familiar enough with that part of the codebase to review the change
      • status: waiting update is added to indicate a review has been given and it is waiting for the contributor to make the required changes
  • FreeCodeCamp seems to have a hierarchy of maintainers
    • When the first reviewer chances upon a fresh PR, it is common for them to just add the appropriate status: waiting review label and only leave comments on necessary fixes to failing CI/CD tests
    • Most of the time, after all the tests pass on a non-trivial PR, the first reviewer does not leave an official review and only adds the label
    • Only after a period of time does another reviewer leave an actual review
    • This seems to imply a hierarchy where less experienced maintainers go over the bulk of PRs and iron out fundamental issues such as failing critical CI/CD so seniors have less to go through.
  • FreeCodeCamp uses Crowdin to manage open source translations
    • Since FreeCodeCamp aims to be accessible to everyone across the globe it needs to have all its resources translated to different languages
    • To make it easier for contributors to help with this effort, it uses Crowdin, a localization management platform with version control and PR functionality similar to GitHub's
    • Crowdin allows contributors to proofread the proposed translations and vote on them before they are accepted, similar to upvoting and downvoting on Reddit
    • This method of managing open source translations is not something I have encountered before and seems to fit very well with the open source nature of the project
    • This kind of translation handling could be put to use in our NUS open source projects if they eventually require i18n and are big enough to accept many translation contributions

Li Zhaoqi

Project: Foo

Give an intro to the project here ...

My Contributions

Give a description of your contributions, including links to relevant PRs

My Learning Record

Give tools/technologies you learned here. Include resources you used, and a brief summary of the resource.

NGUYEN KHOI NGUYEN

Project: VSCode

VSCode is an open-source project by Microsoft. It is one of the most widely used code editors.

VSCode runs on Electron for the desktop version, and on JavaScript (compiled from TypeScript) on the web. It does not make use of any JavaScript framework, because it is older than most of them.

I focused mainly on the functionality side of VSCode. I started out with fixing a small bug, and then moved on to contribute features to VSCode.

My Contributions

My Learning Record

  1. Interface - Implementation Separation

The VSCode codebase makes heavy use of this pattern, where an interface declares how clients can interact with a given class. For example, if Panel is a part of the code editor, then IPanel declares the methods that other classes can call on a panel. Other classes reference only the methods declared in the interface, while the different types of panels that implement IPanel determine the actual behaviour.

This design makes it easier to develop tests, as test cases can substitute stubs for the actual implementations, reducing coupling between components, especially in tests. However, this design makes it more difficult for new contributors to trace code 😅, especially when I am not sure which implementation is actually in use.
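
A minimal sketch of this pattern is shown here. Only the name `IPanel` comes from the description above; its methods, `TerminalPanel`, `PanelToggler` and `StubPanel` are hypothetical and are not taken from the VSCode codebase.

```typescript
// Sketch of interface/implementation separation.
// IPanel is named in the text; everything else here is hypothetical.

interface IPanel {
  show(): void;
  hide(): void;
  isVisible(): boolean;
}

// One concrete implementation; a real panel would also update the UI.
class TerminalPanel implements IPanel {
  private visible = false;
  show(): void { this.visible = true; }
  hide(): void { this.visible = false; }
  isVisible(): boolean { return this.visible; }
}

// A client that depends only on the interface, never on a concrete class.
class PanelToggler {
  private readonly panel: IPanel;
  constructor(panel: IPanel) { this.panel = panel; }
  toggle(): void {
    if (this.panel.isVisible()) {
      this.panel.hide();
    } else {
      this.panel.show();
    }
  }
}

// A test stub: records the calls it receives instead of doing real work.
class StubPanel implements IPanel {
  calls: string[] = [];
  private visible = false;
  show(): void { this.calls.push("show"); this.visible = true; }
  hide(): void { this.calls.push("hide"); this.visible = false; }
  isVisible(): boolean { return this.visible; }
}

// The same client works with the stub in a test...
const stub = new StubPanel();
const stubToggler = new PanelToggler(stub);
stubToggler.toggle();
stubToggler.toggle();

// ...and with the concrete panel in the application.
const terminal = new TerminalPanel();
new PanelToggler(terminal).toggle();
```

The trade-off mentioned above is visible even here: reading `PanelToggler` alone does not tell you which implementation of `IPanel` it will receive at runtime.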

  2. VSCode Extension

VSCode actually has quite a number of default extensions, most of which only activate when you open files of the corresponding languages. My second pull request required me to look at the VSCode extension API.

  • The package.json specifies the behaviour of each extension. For html-language-features and typescript-language-features, the file also specifies when the extension is activated.
  • The default entrypoint extension.ts declares the functions to be called when the extension is activated and deactivated. In particular, typescript-language-features adds VSCode disposables, which contribute features to VSCode, such as syntax highlighting, hover text, definition content, implementations, references, etc. In contrast, html-language-features starts a client that interacts with a server to provide features to VSCode.

My second pull request mainly touches html-language-features. In this extension, the responsibility for implementing the features lies with the server.
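
The activate-and-register flow described above can be sketched roughly as follows. To keep the sketch runnable outside the editor, `Disposable` and `ExtensionContext` are small local stand-ins rather than the real types from the `vscode` module, and `registerFeature` and `simulateDeactivation` are hypothetical helpers. It models only the in-process style of registering disposables, not the client/server style used by html-language-features.

```typescript
// Local stand-ins so the sketch runs without the real `vscode` module.
// A real extension would import these types from "vscode" and export activate().
interface Disposable {
  dispose(): void;
}
interface ExtensionContext {
  subscriptions: Disposable[];
}

// Tracks which (hypothetical) features are currently contributed to the editor.
const activeFeatures: string[] = [];

// Hypothetical helper: registers a feature and returns a handle that undoes the registration.
function registerFeature(name: string): Disposable {
  activeFeatures.push(name);
  return {
    dispose(): void {
      const index = activeFeatures.indexOf(name);
      if (index !== -1) {
        activeFeatures.splice(index, 1);
      }
    },
  };
}

// Called when the extension is activated: register features and keep their disposables.
function activate(context: ExtensionContext): void {
  context.subscriptions.push(registerFeature("hover"), registerFeature("definition"));
}

// Stand-in for the host cleaning up on deactivation by disposing every subscription.
function simulateDeactivation(context: ExtensionContext): void {
  context.subscriptions.forEach((disposable) => disposable.dispose());
  context.subscriptions.length = 0;
}

const context: ExtensionContext = { subscriptions: [] };
activate(context);
const featuresWhileActive = activeFeatures.length;
simulateDeactivation(context);
```

Collecting every disposable in one list means that cleanup does not need to know which features were registered, only that each one can undo itself.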

  3. One Class - One Component

This is consistent with many modern JavaScript frameworks. In VSCode, each component is declared with a class. The actual DOM element is declared as an instance field. Operations that alter the DOM then reference this field.
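
As a rough illustration of this convention, the sketch below keeps an element as an instance field and routes every change through it. `StatusBadge` and `ElementLike` are hypothetical; `ElementLike` is a minimal stand-in for a DOM element so that the example also runs outside a browser.

```typescript
// Minimal stand-in for the parts of a DOM element this sketch touches.
interface ElementLike {
  textContent: string | null;
  hidden: boolean;
}

// One class per component: the element lives in an instance field,
// and every operation that alters the DOM goes through that field.
class StatusBadge {
  private readonly element: ElementLike;

  constructor(element: ElementLike) {
    this.element = element;
  }

  setLabel(label: string): void {
    this.element.textContent = label;
  }

  setVisible(visible: boolean): void {
    this.element.hidden = !visible;
  }
}

const fakeElement: ElementLike = { textContent: null, hidden: false };
const badge = new StatusBadge(fakeElement);
badge.setLabel("index.html");
badge.setVisible(false);
```

Callers never touch the element directly, so the class remains the only place where that part of the DOM is changed.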

Other Observations

  1. VSCode Engineering Bot

In VSCode, approximately 100 pull requests are merged every day. VSCode has a dedicated bot to assign a human reviewer to a PR from an external contributor. From my observation, the reviewer assigned is usually someone with experience in the relevant area, often a person who has commented on or created the original issue that the PR addresses.

I believe this bot can be quite useful for CATcher and WATcher. During the off-season, PRs tend to be ignored, because students do not check the repository regularly for external contributions, and PR creators do not request reviews. Although the assigned person may have already graduated or be MIA, there are possible fixes:

  • Reviewing the PR does not strictly need to be done by the assigned person. Other people can hop in as well. The purpose of the bot is then only to make sure that PRs have at least 1 review.
  • The bot can assign the PR to a second person if the first person does not respond after a period of time.
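
The second fix could be sketched as follows. This is a hypothetical illustration of the fallback idea only, not the VSCode bot's actual logic; the `PullRequest` shape, the `currentAssignee` function and the seven-day timeout in the usage lines are all assumptions.

```typescript
// Hypothetical sketch of "assign to the next person after a timeout".
interface PullRequest {
  openedAt: Date;
  candidates: string[]; // possible reviewers, most relevant first (assumed ordering)
  reviewedBy: string[]; // anyone who has already left a review
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Returns who should currently be assigned, or null if no assignment is needed.
function currentAssignee(pr: PullRequest, now: Date, timeoutDays: number): string | null {
  // The goal is only that the PR gets at least one review, from anyone.
  if (pr.reviewedBy.length > 0 || pr.candidates.length === 0) {
    return null;
  }
  const elapsedDays = Math.max(0, (now.getTime() - pr.openedAt.getTime()) / DAY_MS);
  // Move one candidate down the list per elapsed timeout, stopping at the last one.
  const index = Math.min(Math.floor(elapsedDays / timeoutDays), pr.candidates.length - 1);
  return pr.candidates[index] ?? null;
}

const pr: PullRequest = {
  openedAt: new Date("2024-01-01T00:00:00Z"),
  candidates: ["alice", "bob"],
  reviewedBy: [],
};
const firstAssignee = currentAssignee(pr, new Date("2024-01-03T00:00:00Z"), 7);
const fallbackAssignee = currentAssignee(pr, new Date("2024-01-10T00:00:00Z"), 7);
```

With these inputs the PR stays with the first candidate for the first seven days and then moves to the second, while a review from anyone ends the assignment.
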
  2. CLA Agreement

Before the VSCode Engineering bot even assigns the PR to a person, new contributors need to sign an agreement. The agreement is raised by another bot via a comment, to which I could simply reply to agree. The gist of the agreement was that I would authorise VSCode to use my code and grant them full patent rights.

MarkBind

Lee Hyung Woon

Project: Source Academy

Source Academy is an open source, web-based learning environment for programming developed at NUS. The platform is mainly used to teach CS1101S: Programming Methodology I within NUS, but is also used by overseas universities, such as MIT and Uppsala University.

I have been part of the Source Academy development team for some time; I started as a contributor to the game design and story in AY20/21 Sem 2, and implemented non-trivial features for the game engine directly in AY21/22 Sem 2. However, since then, I have not been able to take a closer look at the game engine-related code.

As part of CS3282, over AY23/24 Sem 2 (and the summer break), I have decided to tackle the code quality and issues surrounding the frontend-facing portion of the game engine: the game simulator.

My Contributions

The contributions I have made are as follows:

  1. Game: Rename Story Simulator references to Game Simulator (#2805): This was the first issue I tackled. It had been nearly 2 years since I last looked at the codebase, and frankly, it took me quite a while to understand the flow of the codebase again. This was a low-hanging-fruit issue, and it gave me the chance to warm up to the codebase once more.

  2. Game: Remove Object Placement feature from Game Simulator (#2810): This was the first non-trivial PR. After the first PR #2805 was merged, I opened the issue "Game: Refactor and Updates to the Game Simulator (#2806)". This served as the ongoing tracker issue for refactoring and polishing the game simulator, and listed the problems in the current game simulator code that I had noticed while working on PR #2805. To remove the Object Placement feature, which had not been used by the game storywriting team for years (a fact I knew, as the sole member of the game storywriting team for years), I had to take a deeper dive into the current architecture of the game simulator. I was able to refactor and remove the necessary portions of code, which also led me to dig deeper into how the Phaser 3 engine works. I also found even more problems while working on this PR, and those formed the bulk of the problems tackled in the next PR.

  3. Game: Refactor Game Simulator (#2836): This was the major PR I worked on; following #2805 and #2810, I decided to refactor the entire game simulator. This not only included dependency updates and rewording of text, but also re-organizing the code to make it more modular. This was also in part influenced by the earlier work in PR #2810; with this PR #2836, removing or adding a module to the game simulator should now be much easier. The UI of the game simulator was also changed to (1) remove issues with browser window sizing, and (2) use more modern UI components.

  4. Game: Save game state for staff and admin (#3013): After #2836, I took a bit of a break from the coding work to instead focus on planning for a major overwrite of the game engine itself with the rest of the game team. As we did not manage to complete the planning in time for the next semester, I decided to end off the summer break with a trivial feature that was in demand for some time, in time for the upcoming semester. That was this PR.

  5. Game: Fix a bug with displaying published chapter data (#3018): This was another PR to fix a bug right before the start of the upcoming semester.

As mentioned, in the gap between #2836 and #3013, we had been planning a massive game engine overwrite that aims to remove the poor coding standards and practices that accumulated over the years, and instead incorporate interesting approaches to game development by making use of functional programming to declare scenes in Phaser. However, we did not manage to complete the planning in time for the next semester, and are instead aiming to complete this work after AY24/25 Sem 2 (i.e. after graduation).

My Learning Record

  1. Phaser: Phaser is a game development framework for JavaScript that is commonly used for web-based games, and forms a core part of the Source Academy game engine. In the past, I did not have to touch much of Phaser, as my other groupmates focused on handling the Phaser portion (while I was mostly in charge of text parsing). Now, with the work on the game simulator, I had to become much more familiar with Phaser; using the Phaser Documentation, I was able to get up to speed on the different components. I did find the Phaser documentation and online resources somewhat lacking, so some things took time to figure out. I recall that I even had to go into the Phaser source code to figure out an input-related issue (either keyboard or mouse, one of the two).
  2. Blueprint: Blueprint JS is a UI toolkit for React-based web applications, and is the main UI toolkit used for Source Academy. As someone who was much more familiar with Material UI, I found Blueprint an interesting tool to learn about, as its approach to UI development is quite different from that of Material UI or React-bootstrap (which were the UI libraries I was most familiar with).

RepoSense

Chang Si Kai

Project: Foo

Give an intro to the project here ...

My Contributions

Give a description of your contributions, including links to relevant PRs

My Learning Record

Give tools/technologies you learned here. Include resources you used, and a brief summary of the resource.

JONAS ONG SI WEI

Project: jest

Widely-used JavaScript testing framework.

My Contributions

Committed a change that was mentioned in a bug report regarding a test's flaky behaviour: https://github.com/jestjs/jest/pull/15517

My Learning Record

The bug fix itself was simple, but when it came to adding a test to cover the fix, things became more complicated. Eventually, a maintainer and I agreed that the fix was too inconsistent to cover with a test, but I still learnt some things while investigating! Unsurprisingly, Jest uses Jest to test itself - and I became more familiar with this testing framework (that is being proposed to be added to RepoSense!).

The contribution workflow to Jest itself was slightly different to RepoSense as well - there is no custom commit message; rather, the name of the PR is the commit message. Additionally, all changes are logged in a separate changelog file that is included with new Jest releases. This progressive workflow where each change is logged as it is committed could be useful to RepoSense as compared to what we currently do.

Project: fastify

Node.js based web framework focused on speed.

My Contributions

Documentation change: https://github.com/fastify/fastify/pull/5988

My Learning Record

This was an interesting change because what the user thought was a bug was actually intended behaviour - which is why they wanted a documentation change to clarify this. When investigating the code that causes this behaviour, I agreed that it could be easily mistaken for buggy behaviour, highlighting the importance of good documentation!

One contribution workflow we could adopt is adding a PR checklist to make sure most bases are covered by contributors before they submit a PR.

POON YIP HANG, RYAN

Project: typescript-eslint

Monorepo for the tooling that enables ESLint and Prettier to support TypeScript

My Contributions

Fixed a bug where declare variables in definition files were incorrectly flagged as shadowing global variables.

My Learning Record

Learnt more about the no-shadow rule of the typescript-eslint ruleset via the documentation.

Differences from our internal project RepoSense observed in typescript-eslint:

  1. PR Checklist
  2. No custom commit message on merge to master
  3. Accepting PRs tag
  4. Issues and PRs are prefixed with words like "Enhancement", "Docs", "Bug" for issues, "chore", "fix", "feat", "test" for PRs.
  5. Presence of Semantic Breaking Change PR CI action.

Recommendations for internal project:

  1. We should consider using the PR checklist as a reminder for contributors to follow the conventions. Sometimes, steps may be missed by accident and the checklist is a great way to keep track.
  2. We could consider using some sort of accepting-prs or help-wanted label for our issues. Currently, a lot of our issues are suggestions or ongoing discussions, and it is difficult for contributors to determine which issues should be worked on. Having a label would make these issues easier to find and convey their status more clearly.
  3. If we would like to increase the frequency of our releases, having some sort of CI action to detect breaking changes would make it much easier to filter PRs with breaking changes.
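
As a rough idea of what such a check could look like, the sketch below parses a PR title written in the Conventional Commits style, where a `!` before the colon marks a breaking change. It is not the CI action that typescript-eslint uses; `parsePrTitle`, `ParsedTitle` and the `ALLOWED_TYPES` list are hypothetical, and it assumes our PR titles would adopt that title format.

```typescript
// Hypothetical PR title check, assuming Conventional Commits-style titles
// such as "fix: ..." or "feat(scope)!: ...".
interface ParsedTitle {
  type: string;
  scope: string | null;
  breaking: boolean;
}

// Assumed set of accepted prefixes; adjust to the project's own conventions.
const ALLOWED_TYPES = ["chore", "docs", "feat", "fix", "test"];

// Returns the parsed title, or null if the title does not follow the convention.
function parsePrTitle(title: string): ParsedTitle | null {
  const match = /^([a-z]+)(?:\(([^)]+)\))?(!)?: \S.*$/.exec(title);
  if (match === null) {
    return null;
  }
  const type = match[1];
  if (type === undefined || ALLOWED_TYPES.indexOf(type) === -1) {
    return null;
  }
  return {
    type,
    scope: match[2] ?? null,
    breaking: match[3] === "!",
  };
}

const breakingPr = parsePrTitle("feat(report)!: change summary file format");
const regularPr = parsePrTitle("fix: handle empty author list");
const malformedPr = parsePrTitle("Update stuff");
```

A CI job could fail on a `null` result and add a label whenever `breaking` is true, which would make PRs with breaking changes easy to filter before a release.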

TEAMMATES

DOMINIC BERZIN CHUA WAY GIN

Assimp

Assimp stands for Asset Importer Library, which handles geometric scenes from various 3D-data formats, for example animations or texture data, and also supports CAD/3D printing formats.

My contributions

Through my FYP, which uses an AI agent to automatically remediate security vulnerabilities, a PR fixing a vulnerability detected by OSS-Fuzz was merged. The pre-print is in the PR description.

My Learning Record

  • OSS-Fuzz is a large-scale infrastructure for fuzzing open-source projects. It also automatically discloses discovered bugs publicly after a certain grace period; for example, this was the original issue
  • Assimp, in particular, also has a fuzzing workflow to run fuzzing for 300 seconds, as part of their CI/Quality gates.
  • This project also has SonarQubeCloud integrated, which runs static analysis to serve as a quality gate.

TLDR

TLDR is a community-driven collection of summary pages for various popular command-line utilities, across various devices and languages, so you don't need to Google/ChatGPT various commands or read through an entire --help/man output.

My Contributions

Added a page for aws s3 sync, a tool I was conveniently using on another project

My Learning Record

  • Various commands for command-line utilities on Windows (I am typically used to OSX/Linux)
  • TLDRs of tons of other various CLI tools, e.g. I didn't know gdrive had a command line
  • (Inspired by Jason) Translations are all done manually, in separate directories rather than in a different repo, but are based on the English variant (i.e. to ensure the same commands are in the TLDR)

Qiu Jiasheng, Jason

react.dev

react.dev is the official documentation website for React.

Contributions

Lessons Learned

  • react.dev supports multiple languages!
    • Backstory: I ran into the Simplified Chinese site by accident when looking through the list of issues. #7447 caught my eye because it was written in Chinese.
  • i18n, or internationalization, is the practice of designing software so that it can be adapted to different languages and regions
    • Backstory: Ironically, react.dev does not use i18n to support translations. Instead, they maintain separate repositories for each language, which confused me because I felt that it was extremely inefficient. Curious to find out the proper way of supporting translations, I asked good ol' ChatGPT and was introduced to the world of i18n.
    • i18next: Powerful i18n framework integrated with many frontend frameworks
    • Helped to inspire my lightning talk!
  • react.dev maintains separate repositories for each language
    • e.g., reactjs/zh-hans.react.dev for Simplified Chinese
    • Pros:
      • Each language community has autonomy to maintain and update versions
      • Each repository can have its own maintainers, who understand the linguistic and cultural nuances
      • Complex and heavy content can be best tailored for each language.
    • Cons:
      • Each repository must track and apply updates to stay synchronized with the main English content
  • The Simplified Chinese repository subscribes to docschina-bot for synchronizing changes from the English documentation via weekly automated PRs
  • Learned more Chinese technical terms used in web development
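
For contrast with the separate-repository approach, a key-based i18n lookup can be sketched in a few lines. This is a hand-rolled illustration of the general idea, not the i18next API and not anything react.dev uses; the `messages` table, the keys and the `translate` function are hypothetical.

```typescript
// Hand-rolled sketch of key-based translation lookup with an English fallback.
type Locale = "en" | "zh-Hans";

// One shared codebase, one message table per language (contents are made up).
const messages: Record<Locale, Record<string, string>> = {
  en: { "nav.learn": "Learn", "nav.reference": "Reference" },
  "zh-Hans": { "nav.learn": "学习" },
};

// Looks up a key in the requested locale, then in English, then returns the key itself.
function translate(locale: Locale, key: string): string {
  return messages[locale][key] ?? messages.en[key] ?? key;
}

const translated = translate("zh-Hans", "nav.learn");
const fallenBack = translate("zh-Hans", "nav.reference");
const missing = translate("en", "nav.blog");
```

Untranslated keys fall back to English, so a new string never leaves a gap in another language. This suits short UI strings, whereas long-form documentation pages are harder to split into keys.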

Comparison with TEAMMATES

  • react.dev uses a similar forking workflow for contributions
  • All react.dev contributors must sign a Contributor License Agreement (CLA), which is checked via a GitHub bot
  • Not sure if I saw this on react.dev, but some GitHub bots automatically hide their comments after requested changes are made to reduce clutter in PRs
    • Suggestion: TEAMMATES has a GitHub bot that comments on PR titles and descriptions. We can modify the bot's script to hide comments after the author fixes the issues. This is a low priority task though, since the bot comments at most once in a PR.

XENOS FIORENZO ANONG

Project: Foo

Give an intro to the project here ...

My Contributions

Give a description of your contributions, including links to relevant PRs

My Learning Record

Give tools/technologies you learned here. Include resources you used, and a brief summary of the resource.