Observations from External Projects

CATcher:

MarkBind:

RepoSense:

TEAMMATES:

CATcher

KANG SU MIN

Project: Pandas

Pandas is a popular Python-based software library for data analytics.

  1. Contributing Guide for Pandas
  2. Development Workflow

Summary of Contributions

  1. PR #51384: Write test for cut() function
  2. PR #51356: Fix errors in doctests
  3. PR #51389: Improve examples for documentation & fix ‘undefined variable’ errors
  4. Issue #51377: flake8 flagging template strings in docstrings as syntax errors

My Learning Record

Pytest

Pytest is a Python testing framework which encompasses unit tests, integration tests, end-to-end tests, and functional tests.

The pandas project uses pytest to test its software. For PR #51384, I had to learn how to write a simple pytest test for the ‘cut()’ function. I read the documentation to learn the basic syntax, and looked through the tests folder of the pandas project to gain a deeper understanding of how tests are written and managed in pandas.

Doctests

Doctests refer to chunks of text that are modelled after interactive Python sessions. Although they are embedded in text, they can be executed just like any other Python code snippets.

In pandas, doctests are mainly included in documentation to illustrate the functionality and features of different functions and data structures.

For PR #51389, I resolved ‘undefined variable’ errors in doctests and improved existing examples in the documentation. With the help of flake8 (a style guide enforcement tool) and the doctest library documentation, I was able to pick out errors in the doctests and fix them.
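As a minimal sketch of what such a doctest looks like, the example below embeds an interactive session in a docstring and runs it with the standard `doctest` module. The `bin_label` function and the `run_doctests` helper are hypothetical stand-ins written for illustration, not pandas code. The second docstring uses a name it never defines, so it fails with a NameError, which is the kind of ‘undefined variable’ problem described above.

```python
import doctest


def bin_label(value, edges):
    """Return the index of the bin (edges[i], edges[i + 1]] that holds value.

    >>> edges = [0, 10, 20]
    >>> bin_label(5, edges)
    0
    >>> bin_label(20, edges)
    1
    """
    for i in range(len(edges) - 1):
        if edges[i] < value <= edges[i + 1]:
            return i
    raise ValueError(f"{value} is outside the bins")


def broken_example():
    """A doctest that forgets to define its input.

    >>> bin_label(5, bins)
    0
    """


def run_doctests(func):
    """Run the doctests in one docstring and return (failed, attempted)."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    # Names the examples are allowed to use; `bins` is deliberately absent.
    globs = {"bin_label": bin_label}
    failed = attempted = 0
    for test in finder.find(func, name=func.__name__, globs=globs):
        result = runner.run(test, out=lambda text: None)  # discard the report
        failed += result.failed
        attempted += result.attempted
    return failed, attempted
```

Calling `run_doctests(bin_label)` reports no failures, while `run_doctests(broken_example)` reports one failed example because `bins` is undefined.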

Differences in Processes

  1. Core developers are each in charge of Pull Request (PR) reviews for a certain aspect of the project, e.g. tests or documentation

Thus, the developers are able to show more depth in their area of expertise, which allows new developers to engage in more meaningful conversations about the issue they are working on. Because they specialise in reviewing PRs for a certain aspect, they are able to link problems to other related issues right away and provide more insight to new developers.

In addition, through PR reviews, the core developers also actively encourage new developers to investigate issues further and participate in the project in other ways. This way, new developers feel more committed to the project.

For example, while working on PR #51356, I detected an error in the way flake8 (a style guide enforcement tool) was processing template strings in doctests. I pursued the issue further and with my findings, the core developers requested that I open issue #51377 for it.

  2. Use of GitHub bots to streamline processes, e.g. assigning developers to issues

Due to the large scale of the project, it is hard for the core developers to assign people to issues manually. Hence, they employ a github-actions bot that automatically assigns an issue to the account that posts “take” in the comments.

This helps to streamline processes and allow developers to start on development tasks faster.
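The decision such a bot makes can be sketched in a few lines. This is only an illustration of the idea, not the actual pandas workflow: the function name, its parameters, and the rule that the comment must be exactly “take” (ignoring case and surrounding whitespace) are all assumptions.

```python
from typing import List, Optional


def assignee_for_comment(
    comment_body: str, author: str, current_assignees: List[str]
) -> Optional[str]:
    """Return the login to assign to the issue, or None if nothing should change."""
    # Assumption: only a comment that is exactly "take" counts as a claim.
    if comment_body.strip().lower() != "take":
        return None
    # Nothing to do if the commenter is already assigned.
    if author in current_assignees:
        return None
    return author
```

A real bot would then call the GitHub API to add the returned login as an assignee; that step is left out here.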

  3. Adding / Updating tests right after bug fixes

After bug fixes are merged, relevant issues are still kept open with a new tag “Needs tests”.

The test added is specific to the bug fixed, such that in the future, it will be able to catch the same bug and prevent possible regression.

  4. Large number of checks for Continuous Integration (CI)

Compared with the 4 checks we have set up in CATcher’s GitHub Actions Continuous Integration process, the pandas project has 39 checks in place. This helps to maintain a codebase to which many developers commit code often.

However, I also noticed that the presence of many checks often causes random failures. In addition, the checks take hours to complete. This could negatively affect efficiency when merging code.

Suggestions for Internal Project

Three main suggestions/tools to be adopted for CATcher would be:

  1. Senior developers building expertise in a particular aspect of CATcher

Right now, senior developers review any PR they come across without being in charge of any particular area. Sometimes, this leads to similar issues / PRs being responded to by different senior developers, which causes ineffective knowledge sharing and possible miscommunication.

The CATcher team could make it a point to divide up the PRs into different aspects (i.e. by adding tags such as ‘tests’ or ‘documentation’) and each take responsibility over reviewing PRs in that particular aspect.

This would allow senior developers to build expertise, and also impart knowledge to junior developers more effectively.

  2. Making it a habit to add tests right after bug fixes or tweaks in functionality

Tests are not prioritised in the CATcher codebase although the maintainability of our application is crucial due to its usage during examinations.

The CATcher team should make it a habit to add or improve on tests right after bug fixes or tweaks in features to keep our codebase robust.

  3. Adopting GitHub bots for assigning new developers to issues

This would reduce the steps where a new contributor needs to request and wait for approval before getting assigned. It could also encourage more students to start contributing.

Lee Chun Wei

Project: httpexpect

According to the project readme, httpexpect is for "concise, declarative, and easy to use end-to-end HTTP and REST API testing for Go (golang)". It allows users to incrementally build HTTP requests to test on a chain, and then inspect the response payload recursively. It also supports testing of WebSockets. This project has about 2.1k stars as of March 2023, even though there are not many PRs and issues now.

My Contributions

I made a total of 3 pull requests.

  1. PR 230: Implement usage check for Expect in Request and Websocket. This PR fixes issue 162. Before this PR, there were no checks on whether the user of this testing library called Expect() more than once for a single HTTP request, or tried to edit the request after calling Expect(). Neither should be allowed, because calling Expect() means that the HTTP request is sent. This PR fixes the problem by explicitly checking that users do not violate these rules. It also adds a sentence to the documentation that warns users about the rules, and unit tests that verify the correctness of the new code.
  2. PR 266: Add lazy reading of response body. This PR fixes issue 244. This PR is important for the implementation of EventSource support (a.k.a. Event Stream, a.k.a. Server-Sent Events). Before the PR, the response body was read entirely and closed during the creation of a Response. This PR ensures that the body is not read inside the constructor, but only on demand, when it is needed for the first time. New tests were added and existing tests were changed to align with the new behaviour. No documentation changes were needed because the PR does not affect the external behaviour of Response for now.
  3. PR 342: Support infinite responses in bodyWrapper. This PR fixes issue 245. This PR is also important for the implementation of EventSource support, and can be seen as a continuation of the previous PR. bodyWrapper is an internal wrapper that is used by the Response to enable reading the response body several times. Before the PR, bodyWrapper read the entire body on construction, cached it in memory, and closed the reader. This PR changes bodyWrapper so that it reads bytes from the reader only when a function is called to read a portion of the body. The content is then cached in a Go slice (a resizable array). To support infinite-length responses, caching of the body in memory can be disabled. This PR made big changes, including to the behaviour of functions like Rewind() and GetBody() in bodyWrapper, so the tests for bodyWrapper had to be changed significantly. As with PR 266, no documentation changes were needed because the PR does not affect any external behaviour.
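The lazy-reading idea behind PRs 266 and 342 can be illustrated with a small Python analogue. This is a sketch under assumptions, not httpexpect's Go implementation: the `LazyBody` class and its `read`/`rewind` methods are hypothetical names, and the sketch treats an empty read as the end of the stream.

```python
import io


class LazyBody:
    """Wrap a readable stream: read on demand, optionally cache for re-reading."""

    def __init__(self, reader, cache=True):
        self._reader = reader
        self._cache_enabled = cache
        self._cached = bytearray()  # bytes pulled from the reader so far
        self._pos = 0               # current read position within the cache
        self._closed = False
        # Note: nothing is read from the stream in the constructor.

    def read(self, size):
        """Return up to `size` bytes, touching the stream only when needed."""
        if not self._cache_enabled:
            # Infinite streams: pass through without keeping anything in memory.
            return self._reader.read(size)
        missing = self._pos + size - len(self._cached)
        if missing > 0 and not self._closed:
            chunk = self._reader.read(missing)
            self._cached.extend(chunk)
            if not chunk:
                # Assumption: an empty read means the stream is exhausted.
                self._reader.close()
                self._closed = True
        data = bytes(self._cached[self._pos:self._pos + size])
        self._pos += len(data)
        return data

    def rewind(self):
        """Restart reading from the beginning of the cached content."""
        if not self._cache_enabled:
            raise RuntimeError("cannot rewind a body that is not cached")
        self._pos = 0


body = LazyBody(io.BytesIO(b"hello world"))
first = body.read(5)   # pulls only 5 bytes from the underlying stream
body.rewind()
again = body.read(5)   # served from the cache, the stream is not touched
```

With caching disabled, `rewind()` raises, which mirrors the trade-off described above: an infinite response cannot be replayed because it is never held in memory.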

My Learning Record

  • End-to-end testing can be done by using this library. Testers can write simple code in Go and run it in order to test their HTTP services from the client's perspective.
  • Go is the language used in this project. It is a simple language to learn, and it is used widely to write server code. It also has built-in concurrency support.
  • The code review by Victor Gaydov (@gavv), the maintainer of the repository, was fast and very meticulous. I was impressed by the quality of the PR reviews. I think that maintainers of open-source projects, including NUS-OSS projects like CATcher/WATcher, can do the following to become better code reviewers:
    • Be quick to respond. Gaydov typically responds within 2 - 3 days after I open the PR or request his review.
    • Check the code style. Even if the code works, it should be easily understood by other readers and consistent with existing coding practices.
    • Be clear in the comments. Gaydov writes detailed and clear review comments on problematic lines of code.
    • Use tags like ready for review or needs revision to easily sort PRs based on their status.
    • Use GitHub bots to check for stale PRs or PRs that need a rebase (because of merge conflicts with the master branch).
    • Enforce a high code coverage. Make sure every line is tested.
  • Good documentation makes a project more user-friendly. httpexpect is well documented, with very clear examples of usage and friendly comments within the code. This includes documentation for developers, as seen in HACKING.md. Because of the good documentation, I found it easy to get started working on this project. I think CATcher, and especially WATcher, can benefit from improved documentation.
  • Tests are very important when writing code. Gaydov always enforces the maximum code coverage possible, with the current code coverage being 95%. In contrast, CATcher's code coverage is a mere 54%, and most of the time, we do not ask for tests in PRs, or we delay them to a future PR. Even worse, WATcher does not have a code coverage checker and there are not many tests in it either. I believe CATcher/WATcher can do better and write more tests to improve code coverage and reduce bugs, though I concede that writing tests for an Angular frontend is not as straightforward as writing tests for an HTTP end-to-end testing library in Go.

MarkBind

KOH RAYSON

Project: Ockam

Ockam is a suite of open source programming libraries and command line tools that handles end-to-end encryption, mutual authentication, key management, credential management, and authorization policy enforcement.

Modern applications are distributed and have an unwieldy number of interconnections that must trustfully exchange data. To trust data-in-motion, applications need end-to-end guarantees of data authenticity, integrity, and confidentiality. To be private and secure by-design, applications must have granular control over every trust and access decision. Ockam allows the app developer to add these controls and guarantees to any application.

As of March 2023, Ockam has a total of 3K stars on GitHub, 203 OSS contributors (including me) and 272K downloads on crates.io (Rust's package registry).

Project website

External Project Workflow

The workflow for contributing to Ockam is fairly standard as far as open-source projects go. Two things stood out as really great: the project maintainers were very helpful, and PRs were reviewed quickly (often in less than a week).

  1. Find an issue to work on, ideally issues that are tagged good-first-issues.
  2. Work on fixing the bug
    • If there are any setup issues, the Ockam team has set up a discussion forum to help troubleshoot any issues.
  3. Accept the Ockam Contributor License Agreement
  4. Craft proper commit messages.
    • Each commit should have a type and scope. It should be organized as type(scope): <subject>. For example feat(rust): ... or refactor(elixir): ....
  5. Work with PR reviewer to get the PR approved.
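The commit message format in step 4 can be checked mechanically. The sketch below is an assumption about how such a check might look, not Ockam's actual tooling: the regular expression, the allowed characters in the type and scope, and the `parse_commit_header` name are all hypothetical.

```python
import re

# Assumption: lowercase type, a scope in parentheses, then ": " and a subject.
COMMIT_PATTERN = re.compile(
    r"^(?P<type>[a-z]+)\((?P<scope>[a-z0-9_-]+)\): (?P<subject>\S.*)$"
)


def parse_commit_header(header):
    """Return the type, scope and subject as a dict, or None if malformed."""
    match = COMMIT_PATTERN.match(header)
    return match.groupdict() if match else None
```

For example, `parse_commit_header("feat(rust): add new command")` yields the three parts, while a header such as `"Fix stuff"` or one without a scope returns `None` and could be rejected by a CI check.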

Resources:

My Contributions

I contributed mainly to the enhancement of the Ockam CLI.

Merged PRs:

In addition to all the merged PRs, I'm pleased to say that I have been recognized by the team at Ockam for my contributions during one of their releases 🎉!

My Learning Record

How to improve visibility of the project

While Ockam as a project only started in 2021, it has now garnered over 3k stars on GitHub and hundreds of people have contributed to it. I think one of the main things that the team at Ockam does to improve the visibility of the project is to maintain a healthy supply of good-first-issue issues.

This is because many sites such as https://goodfirstissue.dev/ and https://goodfirstissues.com/ essentially scrape GitHub repos for the good-first-issue tag and highlight projects with a higher number of such issues.

Rust

Ockam's codebase is mainly written in Rust because it is more secure than other systems languages such as C++, while still retaining much of the performance of a low-level systems language, which is important in the context of cryptographic operations.

The main resource that I used for learning Rust is the official Rust website. There's a more comprehensive textbook that I occasionally referred to when investigating certain semantics of the language. In particular, as someone who is very into Programming Languages and Compilers, I really appreciated Rust's approach to safe memory management through what they call the Ownership System. This is beautifully explained in Chapter 4 of the textbook.

Cargo

Cargo is Rust's build system and package manager. Most Rust projects use this tool because Cargo handles a lot of tasks for them, such as building code, downloading the libraries their code depends on, and building those libraries. Ockam also uses Cargo.

I really appreciated the fact that Cargo is shipped together with Rust as a bundle. Comparable languages such as C++ do not ship with a standard build tool and package manager, which leads to a lot of pain in finding the right ones. The main resource that I used can be found in the Rust textbook.

End-to-end Encryption

This topic is relevant to my project since one of the main uses of Ockam is to allow app developers to easily introduce end-to-end encryption to their project.

End-to-end encryption is a security method that basically ensures that only the sender and the receiver of a message are able to read the message. This means that any third-party intermediaries that the message passes through will not be able to read that message. In particular, it will mean that the government / relevant authorities will not be able to read that message.

End-to-end encryption has become a really hot topic in security recently due to the greater awareness and focus on the topic of user privacy.

Some relevant resources on end-to-end encryption:

Recommendations for MarkBind Project

  1. Ensure that there is a healthy supply of good-first-issue issues

Seeing how Ockam was able to greatly improve its visibility by having a lot of good-first-issue issues, it would be good to have a healthy supply of good-first-issue issues for MarkBind too. This would require MarkBind developers to leave certain low-hanging fruit for new contributors to tackle.

  2. Use GitHub Actions liberally to enforce checks

Ockam uses a total of 17 GitHub Actions workflows to check various things when a PR is submitted, such as code style, commit message style, and that new tests were added and pass. I think there is still room for MarkBind to improve in terms of using GitHub Actions to make certain checks. In particular, I think it would be helpful to have a workflow that checks that commit messages follow the proper convention, since that is a pretty common issue that keeps coming up when reviewing PRs.

  3. Tagging commits with the issue type

Ockam enforces a convention where each commit has to be tagged with a relevant issue type, such as feat for feature, fix for bug fix, refactor for refactor and so on. I think this is a good way of organizing commits, since one knows at a glance what the purpose of a commit is. In addition, it makes releases easier since the commits are already properly tagged.

LIU YONGLIANG

Project 1: MDN Web Docs

MDN Web Docs is an open-source, collaborative project that documents web technologies including CSS, HTML, JavaScript, and Web APIs. Alongside detailed reference documentation, we provide extensive learning resources for students and beginners getting started with web development.

My Contributions

As MDN Web Docs is an educational resource, my main contribution focus has been on

  • researching and replying to issues to help developers with their problems
  • improving the clarity of existing documentation by linking, cross-referencing official specifications, adding examples, and rewriting confusing sections
  • fixing documentation bugs (e.g. typos, broken links, incorrect outputs in interactive examples)

Small-scale

As part of getting familiar with the MDN Web Docs workflow, I have made several (10+) PRs fixing small-scale issues. Most of such work has been done at the start of my contribution period, and I have since kept them to a minimum on a weekly basis, to explore more complex issues.

Selected PRs:

Medium-to-large scale

This includes PRs that are more complex and require more research and effort to complete.

Selected PRs:

Selected Issues:

Summary

Complete list of

  • issues I have opened or investigated
  • PRs I have made and merged

My Learning Record

Reflection

Tools/technologies I learned:

  • All things web related (HTML, CSS, JavaScript, DOM, HTTP, etc.)

It's also a first for me to actually read the HTML and CSS formal specifications, and I have to say being precise and defining standards is not easy!

Resources:

Project Workflow

It's quite easy to start contributing to the project, as it can be done entirely on GitHub. With the use of Markdown, it is also easy to make simple changes to the documentation. The general workflow goes as follows:

  • Editing files and tracking changes in git
  • Creating a pull request
  • Check the preview of the changes
  • Get the PR reviewed and merged by one of the owners

Contributing Guide

Lessons Learned

What can be adopted by MarkBind

  • Automatic PR flaw detection (e.g. broken links, typos, etc.)
  • Establishing code owners and auto-assigning reviewers
  • A dedicated documentation dashboard for writing documentation

Suggested areas of improvement for the external project

I think the project has merit in its own right, especially given the number of page views it gets. However, I think there are indeed areas of improvement that can be made. For example, the search functionality and UX is not as good as it could be. While the search input box gives immediate results in the form of a dropdown, if the search term is not found, the user will have to go to the dedicated search page, which shows a list of search results in plain text. This feels awkward, and I think I would be forced to search for the same term on Google instead. Another thing for improvement is the sidebar. The left sidebar of pages can be quite long and at times, it is not clear how the pages are structured. I think better categorization of the items in the sidebar would help.

Project 2: Dendron

Dendron is an open-source, local-first, markdown-based, note-taking tool. It's a personal knowledge management solution (PKM) built specifically for developers and integrates natively with IDEs like VS Code and VSCodium.

My Contributions

A list of some of my involvements in the project:

I also participated in the project Discord server and helped answer questions from new users for a short period of time.

My Learning Record

Reflection

Tools/technologies I learned:

  • VS Code extension development

Even though I did not dive deep into the codebase and contribute further (due to the fact that the project team decided to pivot to a different direction as the tool did not get to product-market fit), I did learn a lot about the project. I am impressed by the amount of work that has been put into the project, and the documentation is very well-written and detailed. I also adopted the tool for my own personal use since then.

I think the silver lining of this experience is that I have a better understanding of how VS Code extensions work, and I am now more aware of what it takes to build and maintain a large-scale project. For example, some of the events that the project team holds are quite interesting and perhaps we should consider doing something similar for our projects:

  • Dendron Greenhouse
    • In Greenhouse talks, Dendron community members share the fruits of their learning. This may include showcasing workflows, tooling setups, systems, and other topics in personal knowledge management, but also anything that the speaker has in-depth knowledge of that may be of interest to the wider community.

  • New User Tuesdays
    • The Dendron team highlights commonly used features and opens the floor to community Q&A in the Dendron Discord.

  • CROP Event
    • A CROP (Community Request) is an issue that is submitted and voted on by the community.

Resources:

Project Workflow

The project is well-documented, with a dedicated developer guide and details on how to get started contributing. To highlight some of the useful inclusions in the developer guide:

  • RFCs (Request for Comments) for major changes
  • Package level architecture, development guidelines, cookbook, etc.
  • References, FAQs, and troubleshooting

Contributing Guide

ONG JUN XIONG

Project: Tachiyomi

Tachiyomi is a free and open-source manga reader application for Android devices. It allows users to read manga from various sources, including popular websites like Mangadex, MangaPark, and Kissmanga, and also supports importing manga files from local storage. Tachiyomi provides a clean and customizable user interface and offers features like automatic updates, tracking reading progress, and support for multiple languages. Additionally, Tachiyomi offers extensions that enable users to access manga from additional sources and offers customization options such as dark mode, custom reading settings, and more.

https://github.com/tachiyomiorg/tachiyomi/pulls?page=1&q=is:pr+author:Two-Ai

Motivation

I have been using Tachiyomi for at least 8 years now and I really like the app. Hence, I wanted to contribute back to the project. I also wanted to learn more about Android development and Kotlin.

My Contributions

As Tachiyomi is a manga reader that is fairly full featured, my main contributions were bug fixes and code refactoring. I used an alternate GitHub account just for this task as I didn't want my identity to get exposed (especially since this app is used by so many people and heavily forked).

GitHub account used: https://github.com/Two-Ai

I started contributing before starting CS3282, as I thought it would take a lot longer to get my PRs merged, but the dev team was surprisingly fast at merging PRs. I started by fixing some bugs that I encountered while using the app. I also did some code refactoring to make the code more readable and easier to maintain.

I made roughly 30 PRs in the period from December 2022 to March 2023. Most of my contributions were small fixes in the download logic of the app, with some medium-sized PRs, which I go into in detail below.

Medium sized PR

Inline DownloadQueue into Downloader

One of my larger refactors which focused on moving the queue state into the downloader.

Simplify filter logic

In this PR I simplified the logic for filtering manga. This reduced the complexity of the code by quite a bit.

Make DownloadManager the sole entry point for DownloadService

The PR proposed making the DownloadManager the sole entry point for the DownloadService, which improves the codebase in several ways. It provides a clear structure for the Downloader system, simplifies interactions between classes, reduces code duplication, avoids race conditions, and improves accessibility by exposing the Downloader interface to DownloadService without exposing the full Downloader in DownloadManager. These changes make the system easier to understand, modify, and maintain while reducing the risk of bugs caused by concurrent access to the system.

Complete list of

  • issues I have opened or investigated
  • PRs I have made and merged

My Learning Record

Reflection

Tools/technologies I learned:

  • All things Android development related (Kotlin, Android Studio, Gradle, etc.)
  • Git submodules

Skills learned:

Planning: I have learned the importance of planning when working on a complex codebase. Planning helps to identify potential issues and ensure that the changes made to the codebase will improve its structure and maintainability.

Separation of Concerns: I have learned the importance of separating concerns when designing a system. The proposed structure for the Downloader system separates the responsibilities of each class and provides a clear structure for the codebase. This separation of concerns makes the system easier to understand and modify.

Maintainability: I have learned the importance of writing maintainable code. The proposed changes that I've made to the codebase simplify interactions between classes, reduce duplication of code, and make the code more concise and easier to read and understand.

Avoiding Race Conditions: I have learned the importance of avoiding race conditions when working on concurrent systems. The refactored code avoids race conditions by ensuring that the system state is consistent and by limiting the number of dependencies between classes.

Android Architecture Components: I have learned how to use Android Architecture Components such as LiveData, ViewModel, and Room to build more robust, maintainable, and testable Android applications.

Multithreading: I have learned how to manage multithreading in Android applications, including using AsyncTask, Handler, and Executor to perform long-running operations in the background.

Networking (experimental): I have learned how to use Android's networking libraries, such as Volley and OkHttp, to make network requests and fetch data from web services.

Project Workflow

Tachiyomi's development workflow is quite simple. It uses GitHub issues and pull requests to track bugs and features. The project is also split into multiple repositories, each with its own maintainers. The main repository contains the core code and the UI. The other repositories are for the extensions, which are used to fetch manga from different sources. The repositories are linked together using git submodules.

Most of the development talk actually happens on their Discord, where user issues and items that the lead devs want to work on are laid out. The lead devs then assign the issues to themselves or other contributors. The contributors work on the issue and submit a pull request, which is reviewed by the lead devs and merged if it is good.

https://github.com/tachiyomiorg/tachiyomi/blob/master/CONTRIBUTING.md

What can be adopted by MarkBind

  • A community for developers to chat (discord)
  • Fast and efficient code review process

Suggested areas of improvement for the external project

I think the setup and beginner docs should be improved. I would also like the team to give more feedback on pull requests through GitHub, so that I do not have to check my DMs on Discord.

Project 2 - devFi

devFi is a platform that allows developers to earn crypto by contributing to open source projects. We believe that open source projects are the backbone of the software industry and that developers should be rewarded for their contributions.

View it here

This project was developed by me and my friends as we try to give open source developers some incentive to contribute. We plan on making it open source and to launch it as something that web3 developers can use to reward contributors (much like the current bounty systems, but now much easier to use).

How it works

We created a GitHub bot that can be added by any GitHub organization. The organization can then create a bounty for any issue in its repository. The bounty is paid out in the form of a crypto token that is created by us. A developer can claim the bounty by submitting a pull request that fixes the issue. The bot then verifies the pull request and pays out the bounty to the developer.

My Learning Record

Tools/technologies I learned:

  • All things web development and smart contract related (TypeScript, Next.js, Rust, Solidity, etc.)
  • GitHub Apps and the Octokit API

Jovyn Tan Li Shyan

N/A - contributed to MarkBind as an internal project

RepoSense

CHAN JUN DA

Project: Wikimedia Commons App

GitHub: Wikimedia Commons App

Wikimedia Commons is the part of the non-profit Wikimedia family that handles the uploading, reviewing and sharing of free-content pictures. The app allows users to upload their work directly from the mobile device on which they might have taken the photo.

My Contributions

PRs merged:

Issues created:

My Learning Record


  • Kotlin - a language rising in popularity that runs on the JVM and is interoperable with Java. It is frequently used in Android app development.
    • Built-in null safety. A nice summary resource I found showing the comparisons between the null safety methods in Kotlin and equivalent ones in Java's Optional.
  • Android Studio - IDE for Android app development. Interface is almost identical to IntelliJ but includes built-in support for device emulator at various Android SDK levels. The debugger is incredibly powerful as you can run the built application on the emulated device and the breakpoints would pause the emulated device itself.
    • Official information can be found at the Android Studio website
    • I found this Stack Overflow answer particularly helpful for anyone new to Android app development. With Android Studio, we have to first download and install the required SDK version followed by a compatible OS in order to get started.
  • Mockito - Java/Kotlin testing-support framework that allows for powerful mocking of dependencies while defining as little code for it as possible.

Reflections

1. Documents about workflow of external project

2. Important things learnt from contributing

The technical knowledge gained has been covered in the previous section. Here, I will cover some of the things I observed from working on a much larger OSS project.

  • There are many outdated/unverified issues, stale PRs, and stale assignees.
    • In particular, I think the problems relating to issues and issue assignees actually make it more difficult for new contributors to enter, as many viable issues are taken up by people who might have given up on them, and these people are also responsible for finding out whether an issue is still relevant.
    • I think this is a symptom of a very large open source project without a large enough dedicated team to maintain it, as the project is mostly volunteer-based.
  • Good documentation is quite rare unless actively enforced by the maintainers from the start.
    • Many of the classes and methods do not have Javadoc documentation, and much of the documentation that exists is not well elaborated or is possibly even outdated.
    • There is also very little incentive for contributors to work on improving the documentation as new contributors generally prefer to choose things that are more impactful.

I think these two main things are extremely relevant for RepoSense. While RepoSense is not quite as large or attractive to new OSS contributors, I think lowering the barrier to entry via good maintenance and assignment of issues, together with good documentation, can go a long way towards making a contributor's experience better.

3. Practices/tools of external project that can be adopted by RepoSense

  • Mockito - mocking framework for Java which can be used to add more specialized unit tests for classes that have dependencies that might be better off mocked when running these tests.
    • However, at the moment I believe there is no strong need for such a tool.
  • MVP architecture for GUI applications - might be relevant for a potential GUI for RepoSense.
    • The supposed benefit is that the more decoupled responsibilities of the Presenter (as opposed to the Controller in MVC) make it much more testable, especially in conjunction with a mocking framework that is able to mock the View and Model interfaces.
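
A minimal sketch of the testability argument for MVP, under stated assumptions: the names ReportView, ReportModel, ReportPresenter and RecordingView are hypothetical and are not part of RepoSense or the external project. It is written in TypeScript rather than Java, and a hand-written recording view stands in for what a mocking framework such as Mockito would normally provide.

```typescript
// Hypothetical View contract: the Presenter only talks to this interface.
interface ReportView {
  showSummary(text: string): void;
  showError(message: string): void;
}

// Hypothetical Model contract: a data source the Presenter queries.
interface ReportModel {
  countCommits(author: string): number;
}

// The Presenter holds the UI logic but knows nothing about real widgets.
class ReportPresenter {
  private readonly view: ReportView;
  private readonly model: ReportModel;

  constructor(view: ReportView, model: ReportModel) {
    this.view = view;
    this.model = model;
  }

  onAuthorSelected(author: string): void {
    if (author.trim() === "") {
      this.view.showError("Author name is required");
      return;
    }
    const commits = this.model.countCommits(author);
    this.view.showSummary(`${author}: ${commits} commits`);
  }
}

// Hand-written test double that records calls instead of rendering anything.
class RecordingView implements ReportView {
  readonly summaries: string[] = [];
  readonly errors: string[] = [];

  showSummary(text: string): void {
    this.summaries.push(text);
  }

  showError(message: string): void {
    this.errors.push(message);
  }
}

// Stubbed model that always reports three commits.
const stubModel: ReportModel = { countCommits: () => 3 };

const view = new RecordingView();
const presenter = new ReportPresenter(view, stubModel);
presenter.onAuthorSelected("alice"); // records one summary
presenter.onAuthorSelected("  ");    // records one error
```

Because the Presenter depends only on the two interfaces, the whole interaction can be checked without starting a GUI or an emulator.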

4. Suggested areas of improvement for external project

  • Documentation:
    • I think there can be incentives for new contributors to add Javadoc documentation as part of the 5 necessary PRs before they can work on enhancements (instead of just bug fixes).
    • The software design document is still a work-in-progress and there are no resources regarding the architectural design of the codebase.
    • The two points above compound to make it very difficult to understand which classes are responsible for which part of the application.
  • Quick start guide
    • I believe this can be improved by including a section about installing an SDK version and compatible OS for emulation before building.

HUANG CHENGYU

Project: Checkstyle

Quote from the official documentation

Checkstyle is a development tool to help programmers write Java code that adheres to a coding standard. It automates the process of checking Java code to spare humans of this boring (but important) task. This makes it ideal for projects that want to enforce a coding standard.

RepoSense uses Checkstyle to enforce its Java coding standard. The detailed configuration is in checkstyle.xml.

My Contributions

My contributions mainly enhance the existing documentation of Checkstyle. To be more specific, after experimenting with the tool, I added a significant number of new examples to document the various usages of JavadocType, a check on the Javadoc of type definitions such as interfaces, classes, and enums. Here is the list of examples that I added.

  1. 1 example for the default check
  2. 1 example for the usage of the scope property
  3. 1 example for the usage of the authorFormat property
  4. 1 example for the usage of the versionFormat property
  5. 1 example for the combined usage of the scope and excludeScope properties
  6. 1 example for the usage of the allowMissingParamTags property
  7. 1 example for the usage of the allowUnknownTags property
  8. 1 example for the usage of the allowedAnnotations property

Pull request: Issue #7601: Add examples for JavadocType #12736; Issue: Update doc for JavadocType #7601

My Learning Record

Checkstyle

Checkstyle is a tool that helps to enforce a Java coding standard. Through contributing to this project, I learned how to use the tool and saw how much power it gains from the numerous checks that can be specified in the configuration file.

Resources:

Maven

Maven is a build and project management tool for Java-based projects, and it is used by Checkstyle. Maven provides support for project builds, dependency maintenance, and continuous integration. I had to install Maven during the initial setup of Checkstyle and build the project using mvn install. Additionally, I needed to use commands such as mvn clean verify to check whether the CI would pass, and mvn clean site -Pno-validations to build the documentation site for preview when adding new changes. This also motivated me to learn about Maven along the way.

Resources:

Reflection

External Project Workflow

Contribution Guide; Development Workflow; Pull request template; Pull request rules

Workflow:

  1. Create a fork of the repository, clone it locally and initialize the project
  2. Select an issue that has an approved label
  3. Create and switch to a new branch
  4. Implement the changes and commit them with git
  5. Push the changes to the fork repository
  6. Repeat step 4 and step 5 until the development is completed
  7. Squash the commits into 1 and force push it to the fork repository
  8. Rebase the feature branch onto the master branch
  9. Run mvn clean verify to ensure that the CI will pass. If there are errors, return to step 4
  10. Push the commit to the fork repository
  11. Start a pull request

Note:

  1. The commit message must be in the format "Issue #Number: Brief single-line message"
  2. The pull request description needs to reference the associated issue if it exists
  3. Most pull requests should contain a single commit to help the review process
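
The commit message rule in note 1 can be sketched as a small check. This is only an illustration in TypeScript: the function name isValidCommitSubject and the regular expression are my own assumptions, they only look at the first line of the message, and the project's real validation may accept other forms or be stricter.

```typescript
// Assumed pattern for "Issue #Number: Brief single-line message".
// Requires the literal prefix, a numeric issue id, a colon, a space,
// and at least one non-space character of description.
const COMMIT_SUBJECT = /^Issue #\d+: \S.*$/;

// Hypothetical helper: validates only the subject (first) line.
function isValidCommitSubject(message: string): boolean {
  const [firstLine = ""] = message.split("\n");
  return COMMIT_SUBJECT.test(firstLine);
}

const accepted = isValidCommitSubject("Issue #7601: Add examples for JavadocType");
const missingPrefix = isValidCommitSubject("Add examples for JavadocType");
const missingColon = isValidCommitSubject("Issue #7601 Add examples for JavadocType");
```

Here accepted is true, while missingPrefix and missingColon are false. A check like this could run locally before pushing, so that a badly formatted message is caught before review.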

What can be adopted by RepoSense

The contribution workflow seems quite strict.

Here is what can be adopted by RepoSense.

  1. Given that certain issues may be proposed by external users, a new approved label can be used to filter the list of relevant issues that are suitable for a pull request.
  2. To help the review process, contributors should squash their commits into a single commit locally, even though RepoSense already adopts the squash merge strategy.
  3. It should be mandatory for contributors to run the backend and frontend tests to ensure that CI will pass before pushing commits to the remote repository. This can save CI resources.

Suggested areas of enhancement for the external project

Checkstyle, as an open source project, is maintained relatively well, despite its large community. Most of the issues and pull requests are well formatted, thanks to its comprehensive contribution guideline. However, I noticed that its main documentation and Java API documentation come from separate sources, although a significant part of the content in the API documentation duplicates that in the main documentation. A possible suggestion would be to centralize API-related documentation in order to prevent inconsistency and reduce maintenance cost.

Project: MDN Web Docs

Quote from Wikipedia

MDN Web Docs, previously Mozilla Developer Network and formerly Mozilla Developer Center, is a documentation repository and learning resource for web developers.

My Contributions

My contributions mainly enhance the documentation related to HTML, ARIA, and JavaScript.

  1. Pull request: Grammar fix in TextDecoder #24348
  2. Pull request: Add a more detailed explanation of boolean in glossary #24350; Issue: Boolean definition in HTML #24085
  3. Pull request: Adjust the description for srcset of element #24420; Issue: srcset description for element is incorrect and misleading. #22820

My Learning Record

HTML

Through working on Adjust the description for srcset of element #24420, I learned the usage of srcset for image rendering on an HTML page.

Resources:

ARIA

Through working on Add a more detailed explanation of boolean in glossary #24350, I learned the definition of ARIA and how its enumerated attributes work.

Resources:

Reflection

External Project Workflow

Contribution Guide

  1. Create a fork of the repository before contributing
  2. Make and commit the changes by taking either of the following steps
    • Simple changes involving a single file: Edit the source file directly on GitHub UI of the upstream repository and then commit the changes to a feature branch of the fork repository
    • Complicated changes: Make the changes in a local clone of the repository and then commit them to the fork repository.
  3. Preview the changes locally using yarn start. If changes need to be made, go back to step 2
  4. Start a pull request

What can be adopted by RepoSense

The contribution workflow is quite straightforward.

Here is what can be adopted by RepoSense.

  1. Contributors can be encouraged to generate a sample report locally before committing changes to the fork repository. This can prevent premature commits from triggering unnecessary CI runs and reduce CI resource consumption.

Suggested areas of enhancement for the external project

MDN Web Docs has quite a large community. Additionally, the current contribution guideline does not seem to impose many rules on commit and pull request standards. Consequently, issues and pull requests from different contributors may have different styles, which can cause overhead for the reviewers. Therefore, a possible suggestion would be to introduce more rules to standardize contributions and increase maintenance efficiency.

TAY YI HSUEN

Project: Foo

Give an intro to the project here ...

My Contributions

Give a description of your contributions, including links to relevant PRs

My Learning Record

Give tools/technologies you learned here. Include resources you used, and a brief summary of the resource.

Zhou Jiahao

Project: NUSMods

NUSMods is the official course catalogue, module search and timetable builder for the National University of Singapore.

My Contributions

Add support for customisable modules (useful for TAs)

My Learning Record

Jest

NUSMods uses Jest for its testing. I had never used Jest before and I was quite surprised by how easy the tool is to use. It is very easy to set up and the documentation is very clear.

Jest runs very fast and requires little set up. It is also very easy to write tests for React components. I was able to write tests for my code in a short amount of time.

The team enforces a high coverage requirement for PRs. This is a good practice as it ensures that the code is well tested and the code quality is high. This is often neglected in RepoSense and we have to manually open issues for frontend code coverage.

RepoSense uses only Cypress for frontend testing. From my experience and research, Cypress is more suitable for end-to-end testing, because Cypress actually interacts with the components in a browser. However, it is not as suitable for unit testing. Jest is a better choice for unit testing as it consumes less time and fewer resources. I would suggest that we use Jest for unit testing and Cypress for end-to-end testing. This will also boost our frontend code coverage and cover corner cases that we are unable to test in Cypress due to time/resource limitations.

Redux

While I had some experience working with Redux before, I was unaware of the need for a schema migration when updating the structure of a Redux store.

In the process of implementing the feature, I needed to change the structure of the Redux store. The project ran normally in my local browser as I had no persisted data while I was developing. I was then told by the maintainer to include a Redux schema migration.

NUSMods loads the persisted data into the Redux state in order to maintain students' timetable data. Therefore, a change in the Redux structure without any migration would break the data of thousands of active users of NUSMods.

This technique is not applicable to RepoSense as we use Vuex. As the report is loaded from the data every time, no data is persisted in this way. However, it is still good to keep in mind how a change of store structure affects different parts of the system.
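
To illustrate the schema migration idea described above, here is a minimal hand-rolled sketch in TypeScript. All of the shapes and names (PersistedV1, PersistedV2, customModules, migrate) are hypothetical and do not reflect the real NUSMods store, and this is not the API of any persistence library. It only shows the general technique of tagging persisted data with a version and upgrading old data when it is loaded.

```typescript
// Hypothetical shape of data persisted by the old release.
interface PersistedV1 {
  version: 1;
  lessons: Record<string, string>; // module code -> chosen lesson
}

// Hypothetical shape after the feature adds a new field.
interface PersistedV2 {
  version: 2;
  lessons: Record<string, string>;
  customModules: string[];
}

type Persisted = PersistedV1 | PersistedV2;

// Upgrade whatever was persisted to the latest shape before it reaches the store.
function migrate(persisted: Persisted): PersistedV2 {
  if (persisted.version === 2) {
    return persisted; // already up to date
  }
  // v1 -> v2: keep existing data and give the new field a safe default.
  return {
    version: 2,
    lessons: { ...persisted.lessons },
    customModules: [],
  };
}

// Data saved by an existing user before the update (made-up values).
const oldData: PersistedV1 = { version: 1, lessons: { CS1010: "Lecture 1" } };
const migrated = migrate(oldData);
```

Without such a step, code that reads the new customModules field would find it missing on data saved by existing users, which is the kind of breakage that a fresh local browser with no persisted data does not reveal.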

Suggested Area(s) of Improvement

Shortage of manpower

It is rather surprising that such a popular website is maintained by only 2 developers. The review process is often very long, which could be discouraging to less experienced developers. More contributors could be trained for this role.

TEAMMATES

FANG JUNWEI, SAMUEL

Project: Foo

Give an intro to the project here ...

My Contributions

Give a description of your contributions, including links to relevant PRs

My Learning Record

Give tools/technologies you learned here. Include resources you used, and a brief summary of the resource.

DAO NGOC HIEU

Project: Foo

Give an intro to the project here ...

My Contributions

Give a description of your contributions, including links to relevant PRs

My Learning Record

Give tools/technologies you learned here. Include resources you used, and a brief summary of the resource.

ZHAO JINGJING

Project: Foo

Give an intro to the project here ...

My Contributions

Give a description of your contributions, including links to relevant PRs

My Learning Record

Give tools/technologies you learned here. Include resources you used, and a brief summary of the resource.

WU QIRUI

Project: Foo

Give an intro to the project here ...

My Contributions

Give a description of your contributions, including links to relevant PRs

My Learning Record

Give tools/technologies you learned here. Include resources you used, and a brief summary of the resource.