Babel lets you use next-generation JavaScript while still supporting older browsers. Babel will compile your JavaScript into JavaScript that all of your targeted environments support.
These are my merged PRs:
Compiling certain features, such as default parameters and for-of loops, to 'old' JavaScript that lacks them results in many edge cases that are easy to overlook. The first PR was relatively simple and helped me familiarise myself with how Babel works. The second PR was a lot more substantial: I had to rewrite my implementation after my reviewers discovered another edge case. In essence, the representation of JavaScript used by Babel didn't allow us to accurately model what the JavaScript specification dictates, so I had to use a small hack to achieve the correct result.
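The kind of edge case involved can be illustrated with a minimal sketch. The function names are hypothetical, and the two lowered versions are hand-written for illustration; they are not Babel's actual output.

```javascript
// Modern source: the default applies only when the argument is undefined.
function greet(name = "world") {
  return "hello " + name;
}

// Naive ES5 lowering: `||` replaces every falsy argument, not just undefined.
function greetNaive(name) {
  name = name || "world";
  return "hello " + name;
}

// More careful ES5 lowering: only `undefined` triggers the default, and the
// parameter is read from `arguments` so that `length` stays 0 like the original.
function greetCareful() {
  var name =
    arguments.length > 0 && arguments[0] !== undefined ? arguments[0] : "world";
  return "hello " + name;
}

console.log(greet(""));        // "hello "
console.log(greetNaive(""));   // "hello world" (differs from the original)
console.log(greetCareful("")); // "hello "
console.log(greet.length, greetNaive.length, greetCareful.length); // 0 1 0
```

Both the empty-string argument and the `length` property are observable differences, which is why a lowering that looks obviously right can still disagree with the specification.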
Babel's contributing guide lists all the steps needed to build and test Babel in an easy-to-follow manner. It also lists where to get help (their Slack channel), how to submit PRs, and lots of other important information.
Even big projects like Babel can be lacking in documentation. It was difficult to get started with Babel because the project is so huge, and it took many hours of trial and error to figure out where to even start. They had some small guides on their Abstract Syntax Tree and on how to write plugins, but that was it. This unfortunately seems to be an issue with many projects, as I faced the same difficulty when first starting out with MarkBind. Nonetheless, it would be great for these projects (and others too) to have better documentation, not just so that it is easier for new contributors like myself to submit PRs, but also so that maintainers know, a few years down the road, why a certain thing was written the way it was.
In contrast with its lacking documentation (i.e. how to implement a feature), it was comparatively easy to know what that feature should do. Babel tries to implement the latest ECMAScript specification, and the specification provides a very detailed explanation of exactly how a language feature (e.g. default parameters) should work. This is why bugs reported in Babel can be determined to be legitimate objectively and quickly: reference the specification and that's it!
This aspect is very different from other compiler projects out there (including MarkBind). MarkBind, React, Vue, etc. are all projects whose specification is also part of the project. So when a bug in the syntax is reported, it could be a bug in the specification (the implementation correctly follows the specification), a bug in the implementation (it does not correctly follow the specification), or maybe even both.
Should MarkBind have a formal specification? It currently uses a mix of several different syntaxes, so unless a custom MarkBind parser is written, we probably have no reason to do so.
The nature of Babel (compiling JavaScript into JavaScript) nicely gives rise to its test suite: one folder for each test, with each folder containing two files, `input.js` and `output.js`. `input.js` contains the code to be transpiled, and `output.js` contains the expected output. Adding and updating tests is therefore really easy. Of course, there are other kinds of tests in Babel, but the one I described is the most commonly used.
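A minimal sketch of how such input/output fixtures can drive a test run is shown below. It is not Babel's test runner: `toyTransform`, `fixtures` and `runFixtures` are hypothetical names, and the fixtures are kept in memory instead of being read from folders so that the example stays self-contained.

```javascript
// Hypothetical stand-in for a compiler: rewrites `let`/`const` to `var`.
function toyTransform(source) {
  return source.replace(/\b(let|const)\b/g, "var");
}

// Each entry mirrors one test folder: an input and its expected output.
const fixtures = {
  "let-declaration": { input: "let x = 1;", output: "var x = 1;" },
  "const-declaration": { input: "const y = 2;", output: "var y = 2;" },
};

// Runs every fixture through the transform and collects the mismatches.
function runFixtures(transform, fixtureMap) {
  const failures = [];
  for (const [name, { input, output }] of Object.entries(fixtureMap)) {
    const actual = transform(input);
    if (actual !== output) failures.push({ name, expected: output, actual });
  }
  return failures;
}

const failures = runFixtures(toyTransform, fixtures);
console.log(failures.length === 0 ? "all fixtures pass" : failures);
```

Adding a test in this style means adding one more entry (or, on disk, one more folder), without touching the runner.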
In a way, Babel is very similar to MarkBind, in that they are both compilers.
MarkBind currently tests MarkBind snippets through test sites: there are a few test sites, each with many `.md` files that make up a full site. The expected HTML output of all these files is stored in a separate folder. This makes it hard to compare the input and output, since the source MarkBind file and the output HTML file are located quite far apart. It might be feasible for us to introduce Babel's method of testing, or even rewrite our current test suite around it. Test updates would then also be more modular, as all additions would be entirely confined to a new folder, resulting in easier code reviews as well!
You can easily share a code snippet on the Babel playground. You can choose between many different configuration options, plugins, and versions, and also share the link to the exact state of your REPL. This makes it an ideal way to submit bug reports, where anyone can read the issue and then click on the link that reproduces the issue exactly. This would eliminate many "unable to reproduce" bugs in practice, since the reporter can verify bugs on the playground itself before reporting.
MarkBind might be able to implement this to a certain degree, such as self-contained MarkBind syntax that does not reference other files. This would allow easier verification of bugs, since developers can simply click on a link and see the bug in action instead of having to copy the offending code over and then building the site. However, we wouldn't be able to call this a true "MarkBind playground" since it does not include all features of MarkBind.
React-Bootstrap is an open-source library of Bootstrap 4 components rebuilt with React, to be used with React.
Links to documentation of project workflow:
`as` prop #5044
The use of a sandbox environment with an integrated editor, particularly via codesandbox.io, makes it easier to report bugs. One just needs to paste the corresponding source code, and the rendered output can be seen on the other half of the screen. Bug reporters would also give brief instructions, such as steps for reproduction or what to focus on. The sandbox environment makes it easy for developers to see what bug is being reported, and to reproduce issues easily and quickly. For now, it seems that sandboxes on codesandbox.io can be configured for popular libraries (such as Vue, React and Angular). It would be extremely nice to investigate how to configure one for MarkBind, be it by talking to the developers of codesandbox.io or by building our own code playground.
Writing documentation comments on a component's attributes (in React terms, `props`) in the JavaScript source automatically generates documentation in the form of a table. This is achieved through a function/component they wrote themselves. It keeps the documentation and code always in sync, and reduces the number of places to update from two (the code and the docs) to just one (the code). This is easy for the project to implement because each React-Bootstrap component is isolated in its own React JavaScript file. The closest equivalent we have in MarkBind would be the vue-strap components. Unfortunately, it seems more difficult to do the same for MarkBind syntax, especially for elements that use `markdown-it-attrs`.
The project encouraged test-driven development, which gave me a straightforward and pleasant experience when working on an enhancement. First, I add a small unit test, like the code block below, representing my intended behavior. Then, I code until the test case passes. In MarkBind, I would have my own test site and add all the components and attributes I want to test in the Markdown file. While coding a feature, I would `serve` the test site and visually check whether I had implemented my intended behavior. This works but is a slightly clunkier experience, and visual inspection is not always reliable. Why I did not use MarkBind's existing snapshot-like testing tool, i.e. `npm run testwin`, is worth further reflection and discussion with the MarkBind team. One thing I definitely appreciated in React-Bootstrap was that their testing framework gave feedback via a delightful UI, showing specific headings stating which component and what behavior is currently being tested. One could consider my observation a small nit, but in my opinion, it goes a long way in encouraging test-driven development.
it("accepts as prop", () => {
  mount(<FormLabel as="legend">body</FormLabel>).assertSingle("legend");
});
Public communication channels are important for both users and developers of open source libraries. React-Bootstrap has a public Discord as an open channel for contributors, new and old, to discuss the project or ask quick questions about the project or workflow. I learnt from one of the maintainers in the channel that discussions about the project take place both in GitHub issues and in the Discord channel. To my knowledge, MarkBind has a Slack, but it seems to be invite-only. Should MarkBind want to open up to public contributors, it should consider providing and promoting a public channel as well. For questions about using the library, React-Bootstrap directs users to Stack Overflow, with questions tagged `react-bootstrap`, or to a different channel in their Discord suited for "How do I use X" queries.
Improving code coverage has been a major effort for the project; it started on June 11 2019 and is almost complete. To organise efforts to increase test coverage, a central hub was created in the form of a pinned GitHub issue. It contains a checkbox list of links to a code coverage site (codecov.io) that details coverage per file, mostly for the major components. From that page, contributors can indicate their interest in adding tests for a specific component, which prevents overlapping work.
The use of Prettier reduced the possibility of having to deal with coding style nits, both as a contributor and as a reviewer, shortening the time taken to get a PR merged. Adopting Prettier can be, and has been, suggested for MarkBind before.
These two observations are worth noting as a library/project matures. One of the issues raised was about adding a blog for future releases, which would be useful for users of the library to refer to when migrating to a newer release. MarkBind is still relatively young, but in the far future, when we might move to v3, we need to take note of having a point of reference for migrators. Another issue asked which organisations are using React-Bootstrap. Even though React-Bootstrap is used by 198172 repos, even such a widely used library does not know whether it is used by any major organisation. Unfortunately, only 2 organisations replied to that issue, which is now buried past the first page of issues.
React-Bootstrap has a repo of examples, which is useful for users to refer to. Beyond the 2 templates for MarkBind, we can consider adding templates showing more in-depth usage, or suited to a particular use case, for example a 'portfolio site' template or a 'module site' template.
Lack of triaging is not the end of the world for a project. While exploring projects before React-Bootstrap, I realised that even the `good first issue` label isn't all that useful. If anything, it is rarely used effectively or appropriately in most projects I've seen, or such issues are taken up very quickly. From my observations, having active maintainers commenting on issues does a lot to encourage new contributors to take up tickets. There are better ways to judge whether a project is in good health, such as seeing that new PRs are merged on a close to daily basis, rather than how colorfully its issues are labelled.
Hugo is a static-site generator written in Go. It is most famous for its performance: build times are under one second for most websites, and it claims to be the world's fastest website engine. My personal favorite feature of Hugo is shortcodes, which combine the simplicity and ease of use of Markdown with the flexibility of templates.
Links to Hugo's contributor documentation
I have two merged PRs for Hugo:
And two merged documentation PRs for MyPy and ZeroMQ respectively:
Hugo's contribution guide is written for anyone who is willing to contribute, even if they have no experience with Go, GitHub, or how Hugo specifically works. The content is written in a tutorial-like fashion with each step explicitly explained. It includes a ton of introductory learning resources for newcomers to refer to and takes different development environments into consideration (Mac/Windows/Linux).
It covers the following key points that are common for any type of contribution:
While it is true that people can easily search for solutions online (e.g. various methods for changing the Git history), including this information in the exact order of build -> test -> format -> clean up commits not only makes contributors' lives easier, but also spares reviewers from spending time on many non-standard pull requests. It is a huge waste of time when a PR is generally OK but has minor non-compliance here and there.
While GitHub issues can be used as a forum, it is sometimes not ideal for people to create an issue simply because they don't know how a certain functionality works. An extreme example is the issue tracker for Flutter, where most of the issues are questions that should be asked on Stack Overflow or issues for third-party plugins. Hugo, on the other hand, has its own forum for discussions about community-contributed templates and site-building techniques. Separating questions, complaints and requests for help from bug reports and feature requests is extremely beneficial for large OSS projects, especially those with no company support.
PRs for Hugo must be self-sufficient to be merged: bug fixes must come with extended tests, and feature implementations must come with inline documentation, user documentation and unit tests.
The maintainer (especially the founder) is rather particular about the naming conventions of the project. This becomes extremely important as the project grows larger, because newcomers need to search for file and function names to locate a small area in which to set breakpoints and investigate. As long as functions that do similar things are named similarly, pinpointing related code becomes much easier. However, this approach does not make up for the fact that Hugo lacks comprehensive documentation of its architecture and inner workings. The organization of the source code is also rather flat (i.e. there are few nested folders to show hierarchy), which makes it even more difficult for newcomers to grasp how data flows through the generator.
Hugo has a similar audience and similar use cases to MarkBind. Here are some of the insights we can get from observing how the Hugo project operates:
As I also contributed to CATcher, some insights from Hugo apply there too:
`package.json` to find possible `npm` commands anyway. Some introductory material on how Angular and Electron work would be nice.
Text-to-Image is a library for generating an image data URI representing an image containing the text of your choice. It supports a large variety of options, such as font size, font family, color, rotation, and custom margins/kerning. The library exposes a single function that parses all those options and converts the supplied text to an image.
The existing implementation is a Promise-based async function, which cannot be used in a synchronous rendering flow, such as the plugin rendering context of MarkBind. After understanding the project code, I was able to cleanly refactor the existing code to separate the core logic from the function entry point, which allowed me to implement a synchronous variant of the same function with minimal duplicated code.
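The shape of such a refactor can be sketched as follows. This is only an illustration of the pattern, not the library's code: `renderCore`, `generate`, `generateSync` and the option names are hypothetical, and the "rendering" is reduced to computing image dimensions.

```javascript
// Core logic: synchronous and free of Promises, shared by both entry points.
// Hypothetical stand-in for the real rendering work.
function renderCore(text, options = {}) {
  const { fontSize = 18, margin = 5 } = options;
  const width = text.length * fontSize + 2 * margin;
  const height = fontSize + 2 * margin;
  return { text, width, height };
}

// Promise-based entry point, kept for existing asynchronous callers.
function generate(text, options) {
  return new Promise((resolve) => resolve(renderCore(text, options)));
}

// Synchronous variant for callers that cannot await a Promise.
function generateSync(text, options) {
  return renderCore(text, options);
}

console.log(generateSync("hi", { fontSize: 10 })); // { text: 'hi', width: 30, height: 20 }
generate("hi").then((image) => console.log(image.width)); // 46
```

Because both wrappers delegate to the same core function, a fix or new option only has to be implemented once.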
The user-facing documentation is very well written. It is clear and easy to understand, containing just enough examples to illustrate its use without being verbose. We could adopt this approach in our internal projects.
This project has 100% code coverage using Jest. In this case it is feasible and practical because it is a small project, but I think the spirit of it could be reproduced in our internal projects. The comprehensive arsenal of unit tests checks for everything imaginable, including an interesting test where a few pixels of the generated image were checked. If we haven't already, MarkBind could look into tests that employ a similar screenshotting method to augment our diff checks.
The Source Academy is a gamified platform used in NUS's introductory programming module for Computer Science Freshmen, CS1101S. It is designed to teach students coding in a fun and interactive manner. The Cadet Frontend in particular houses the source code for the frontend written in ReactJS with Redux.
I worked on improvements to the sound library for the Source Academy.
These are my merged PRs:
The sound library suffered heavily from performance issues. For more advanced assignments, students often had to wait upwards of 10 minutes between pressing "run" and hearing the actual sound played.
This was solved by reimplementing sound processing and removing redundant, computationally expensive operations.
Some new features implemented include:
Source Academy developers maintain a wiki page for each sub-project or library which details the features, developer-relevant details and future plans or current limitations for the project.
This makes it easier for newer developers to get an idea for what each sub-project is about, its current state, and which areas need more work.
Unlike PowerPointLabs, the issue tracker for the Source Academy does not automatically generate an issue template for new issues. This could be an area of improvement for them as an issue template would ensure that all the relevant details are included when creating a new issue.
Taking a page from the Source Academy, it would be beneficial for the developers of each lab to maintain a wiki topic to help new developers get acquainted with their lab.
It can also serve to document future plans or current limitations.
Both Pylint and Checkstyle are static code analysis tools (for Python and Java programs respectively). Both tools are mature and well-known in the communities of their respective languages. Pylint has 370 checks available by default, with a project history spanning 17 years. Checkstyle has 170 checks available by default, and has been under development for 19 years.
Interestingly, though Python is a dynamically typed language, Pylint can perform some type checking as well.
Links to documents for new contributors:
I added 2 new checks to Pylint, and a new command to the Pylint CLI tool. These contributions will be part of Pylint version 2.5.
Links to PRs:
`assert` statement
`isinstance` function
application
For Checkstyle, I wrote documentation to illustrate the usage of a check.
Link to PR:
Checkstyle has a comprehensive suite of continuous integration checks that automate many administrative tasks. For example, their CI scripts can detect the following (amongst others):
With these checks, reviewers need not spend time catching violations of project conventions. They can focus on the actual review instead.
Recommendation for RepoSense
We should try to automate the "admin work" as far as possible. This process has been started as we have been investigating how GitHub Actions can be applied (to close stale PRs, for example).
In addition, we can use a git hook to run the linters before every `git push` command is executed. This removes the need for reviewers to remind PR authors that the CI checks are failing due to a code style issue. We could also consider adopting the automated spell checker used by Checkstyle.
Given that open source projects (like those under NUS OSS) are often developed by volunteers, it's in the interest of projects to attract new members.
Both Pylint and Checkstyle attract a steady stream of new contributors.
One reason is that the maintainers use triaging to highlight areas where new contributors can realistically make meaningful contributions. On Pylint's issue tracker, several interesting (and realistic) ideas for new checks are set aside for new contributors. In Checkstyle's case, maintainers mark out not only "good first issues", but also "good second issues" and "good third issues". Such triaging makes it more likely for new members to invest time on the project, because they see a clear pathway towards making a significant impact.
Recommendation for RepoSense: RepoSense does not have a high number of meaningful first-timer issues. We should consider brainstorming more such issues, especially in the periods when we are expecting new contributors.
Moreover, it can be confusing for new contributors to find a suitable task after making their first contribution. There could be an additional `Easy` label that identifies work for developers who have made some first contribution, but are still new to the project. There shouldn't be any restriction on the number of `Easy` issues one can fix.
With the `Easy` label and a larger number of first-timer issues, new contributors are more likely to invest time in RepoSense as they will see a clear pathway towards a meaningful contribution.
When we implement a feature or fix a bug, there are often multiple possible solutions. Ultimately, we make a decision based on the team's judgement. However, the reasons for these technical decisions can become inaccessible to new members if they are not documented anywhere.
After working on Pylint, I have come to appreciate RepoSense's comprehensive commit messages. It is incredibly convenient that I can understand the evolution of most classes and methods by retrieving the commits associated with them.
Recommendation for RepoSense
Though RepoSense has strict conventions for commits on `master`, there are no conventions enforced for commits in PR branches. I suggest that we adopt the use of `git rebase` (and not `git merge`) for bringing PR branches up to date with `master`. This will remove a significant amount of "noise" from merge commits in the PR branch.
OpenDota is a fully automated Dota 2 match replay parsing tool that provides the OpenDota API for consumption, which in turn powers the OpenDota UI.
I added 2 new features to OpenDota: one for the web UI, and the other for aggregating data to expose through a new endpoint.
I have merged 2 PRs for OpenDota:
Meaningful first issue: A meaningful first issue is important as it is the first guide into the codebase of a project. My first issue in OpenDota was highly educational, as the issue was not contained in a small area. I had to consider multiple parts of the project, such as gaining knowledge of the backend API of the project.
Fast response rate: My reviewers were highly responsive in giving reviews and feedback on my pull request, allowing me to iterate on it quickly. They were also quick to provide extra help on Discord, such as answering queries about getting started with the project and other relatively basic questions.
Quick setup: OpenDota uses Docker with microservices to ensure that a single point of failure would not bring down the entire server. More importantly for me as a new contributor, Docker allowed me to get started quickly, with only minor issues that were soon resolved. Otherwise, many steps would be required to get the service up, such as manually running several scripts for each service.
My internal project is RepoSense. The two projects are quite similar in that both have a frontend and a backend service, though OpenDota is more complex in nature.
Wider scope for first issues: I personally find the labelled good first issues to be rather self-contained. While it is easy to resolve these types of problems, they don't actually teach the contributor much about the project. I feel that a good first issue should be a stepping stone for new contributors to understand more about the project, rather than simply a minor bug fix with no exposure to the rest of the project. That said, I don't think the issue should span the entire scope of the project, but it should at least cover a sizable part of the codebase.
Set up a public chat channel: While the internal contributors have a Slack channel to communicate problems and ask for help, external contributors do not have this avenue. I feel that we can set up a public chat channel to make it easier for external contributors to contribute to the project, as well as to answer any queries that our documentation may not cover.
OpenRefine (previously known as Google Refine) is a powerful tool for working with messy data. The tool is capable of cleaning data, transforming it from one format into another, and extending it with web services and external data.
The server side of OpenRefine is implemented in Java as a single servlet, which is executed by the Jetty web server and servlet container. The client side is implemented in HTML, CSS and JavaScript, with libraries such as jQuery and Recurser jquery-i18n.
The following are my merged PRs for OpenRefine:
My contributions to the project started off mainly with bug fixes (PRs 1 and 2) to ease my way into the large existing codebase. After gaining a deeper understanding of the codebase, I added some new features (PRs 3 and 4) that provide more value and reduce the effort required of the user.
Video Hub App is an application that provides a fast way to browse and search for videos on your computer. It is like a YouTube for videos on your computer that allows browsing, searching and previewing.
The application is built with tools such as Angular and Electron.
The following are my merged PRs for Video Hub App:
Submitted Issues:
My contributions to the project mainly consist of finding and fixing bugs in the application.
Video Hub App is a rather small and new application, with only two active developers working on features while other small bug fixes/features are open to external contributors.
Checkstyle is a tool for checking Java source code for adherence to a Code Standard or set of validation rules.
The link to the workflow for contributing to Checkstyle is here
Below is the workflow for contributing to Checkstyle:
The workflow of Checkstyle is similar to that of RepoSense. This section lists some major differences between the two projects.
Fast Response: The developers of Checkstyle are very active and will usually respond to a PR within one day. They also have a very clear workflow in which approvals are requested in a fixed sequence. A PR is only merged after it obtains approval from all requested developers.
Issue Label: They have a very complete set of issue labels. The 'approved' label is used to distinguish issues that are waiting for a PR from issues that are still under discussion. Contributors should only work on issues with the 'approved' label. While RepoSense does have the 'todiscuss' label, it does not enforce it. This may make it difficult for new contributors to find an issue to work on.
CI Checks: Checkstyle has 16 checks in total that must pass before a PR can be merged. The checks cover a variety of aspects; there are even CI checks for the commit message and the number of commits. It also requires the code to have 100% coverage according to the JaCoCo test report. Checkstyle also has a check named `pitest`, which is a kind of mutation test. It modifies some part of the code and expects the tests to fail. If the code does not pass the mutation test, the test cases or the main code can possibly be improved.
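The idea behind mutation testing can be shown with a small, self-contained sketch. It does not use pitest (which works on Java code); the functions and the hand-written mutant below are hypothetical and only illustrate the principle.

```javascript
// Code under test.
function isAdult(age) {
  return age >= 18;
}

// A "mutant": the same function with one operator changed (>= became >).
function isAdultMutant(age) {
  return age > 18;
}

// A test suite is modelled as a function that returns true when all checks pass.
function weakSuite(fn) {
  return fn(30) === true && fn(5) === false;
}

function strongSuite(fn) {
  return weakSuite(fn) && fn(18) === true; // adds the boundary case
}

// A mutant is "killed" when the suite passes on the original but fails on the mutant.
function kills(suite, original, mutant) {
  return suite(original) && !suite(mutant);
}

console.log(kills(weakSuite, isAdult, isAdultMutant));   // false: the mutant survives
console.log(kills(strongSuite, isAdult, isAdultMutant)); // true: the mutant is killed
```

A surviving mutant points at a gap in the tests (here, the missing boundary case) even when line coverage is already 100%.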
Regression Report: Checkstyle has its own test tool. Each time a new check is implemented, the contributor is expected to submit a diff report that displays the difference in the violations reported after introducing the new check. This is very helpful for checking the effectiveness of the new code, as well as for immediately determining any regression it causes.
For now, it seems RepoSense does not enforce a regression check before merging a PR. Since most of our work now deals with the frontend, maybe we should have a more complete set of Cypress tests to detect any possible change in frontend behaviour.
We can also have a PR template to remind new contributors to stick to our guidelines, especially the format and the line length constraints.
In addition, we should enforce a stricter rule on the reviewing sequence. Maybe we should even write this into our developer guide so that new contributors know whom to seek help from. Also, some contribution guidelines are not written explicitly in our developer guide, such as the code block for the PR message and the 72-character rule.
Oppia is an online learning tool that enables anyone to easily create and share interactive activities (called 'explorations'). These activities simulate a one-on-one conversation with a tutor, making it possible for students to learn by doing while getting feedback.
Links to getting started:
I chose this project because I liked the way they interacted with first-time contributors and their response rate to PRs. They gave a first-time contributor's suggestion the same importance as a senior developer's, which was something I really appreciated.
Working on both RepoSense and Oppia has helped me understand the difference in scale between the two projects and how a project's organization and management depend on its size. It has also helped me draw key comparisons between the two and see how one project can learn from the other.
Suggestions for RepoSense:
Onboarding new developers: Oppia has an onboarding team which provides each first-time contributor with an onboarding mentor, whose role is to guide them through their first contribution to the project and help them find the tasks best suited to their strengths.
Closing of Stale PRs: Oppia has a high response rate to PRs. As a result, any PR that hasn't had any activity in 7 days is marked `stale`, and if there is still no activity, it is closed after 4 more days. This keeps the underlying issues open for other developers to take up.
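The policy can be written down as a small classification function. This is a sketch under assumptions, not Oppia's actual automation: `classifyPr` is a hypothetical name, and it assumes the 4 days are counted from the moment the PR becomes stale.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Thresholds taken from the description: stale after 7 idle days,
// closed after 4 further idle days (assumed to follow the stale mark).
function classifyPr(lastActivity, now, staleAfterDays = 7, closeAfterStaleDays = 4) {
  const idleDays = Math.floor((now - lastActivity) / DAY_MS);
  if (idleDays >= staleAfterDays + closeAfterStaleDays) return "close";
  if (idleDays >= staleAfterDays) return "stale";
  return "active";
}

const now = new Date(Date.UTC(2020, 3, 20)); // 20 April 2020
console.log(classifyPr(new Date(Date.UTC(2020, 3, 15)), now)); // "active" (5 idle days)
console.log(classifyPr(new Date(Date.UTC(2020, 3, 12)), now)); // "stale"  (8 idle days)
console.log(classifyPr(new Date(Date.UTC(2020, 3, 8)), now));  // "close"  (12 idle days)
```

A scheduled job (for example, a GitHub Action) could run such a check daily and apply the label or close the PR accordingly.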
Reviewing of PRs: Oppia assigns a senior developer to each PR based on the files the PR edits. This ensures that the contributor keeps getting feedback whenever they are ready or have a question. It also improves the reviewing pace. Once the assigned senior developer approves the PR, other senior developers review it.
Improvements for Oppia:
Improved documentation: Oppia uses a large number of tools to keep its project up and running, but there is not a lot of information about what these tools are. It also doesn't provide proper documentation on how to run Oppia locally. Such documentation, though tedious to write, would help developers understand what needs to be done locally.
Default data: Oppia doesn't provide default data when developers run it locally. This makes it difficult for first-time contributors to understand the project when they run it locally. RepoSense, on the other hand, provides default data, which helps the developer understand, to some extent, the purpose of each feature and the way it is implemented.
Source Academy is a website that facilitates the teaching of the CS1101S course in NUS. The course is conducted using a programming language called Source, a subset of JavaScript. The website provides the following features:
Source Academy is a small project that encourages experimentation and learning, and hence allows year 1 students to contribute to it. There are not many requirements for the maintenance of the project, and contributors are free to develop new features without the stress or worry of meeting standards.
In contrast, CATcher is a simpler application that focuses on delivering a service for a crucial assessment in the CS2103/T module.
Source Academy recommends unit tests to be written for new features that are developed.
On the frontend, tests are written with Jest.
On the backend, tests are written with ExUnit.
CATcher can benefit from having more unit tests and integration tests. This ensures that CATcher does not suffer from regression issues if we seek to develop new features or experiment with the application more.
Essentially "Markdown plus", MarkBind is a tool to generate dynamic websites using a Markdown-like syntax. It supports tooltips, icons, search, navigation and more. MarkBind's development workflow can be found here.
D3 is a JavaScript library for manipulating web pages based on data. It is comprised of several modules (d3-array, d3-color, etc.) that can be imported either independently or all at once, and contributions to this project are per-module. D3 has little mention of a development workflow, only a small note in its main wiki page about running a local version of D3 in the browser for development purposes – I used the Node.js console and Observable, a repository of JavaScript notebooks, to develop my code.
The contrast between MarkBind (an NUS-OSS project) and D3 (an external project) provided some insights:
git log --graph, can with proper management provide as much information as the commit messages themselves. A project's workflow should, as far as possible, keep a linear sequence of commits in each branch, but not necessarily one commit per branch.
TMK is a keyboard firmware with useful features for micro-controllers, started by @hasu, a keyboard enthusiast in Tokyo, Japan. QMK branched out from TMK to offer support for proprietary keyboards from certain Western companies.
TMK/QMK goes beyond providing basic functionality and also allows users to customize keymaps. Keymaps allow for fine control over the keyboard, where anything from LED light patterns to OLED screen displays can be customized.
I prepared a patch to enable two-way communication between a Lily58 keyboard and the computer, displaying output on the two LED screens attached. This is achieved by having a HID server running upon OS startup to ping the keyboard with an initialization packet to start the exchange of data.
Unfortunately, this relies on the new split-keyboard API, a PR which was proposed in July 2019 and was only merged in May 2020.
TMK is sparse in documentation and contribution guides. Potential contributors are recommended to reach out to @hasu before even opening a PR. Otherwise, PRs are simply ignored due to lack of attention and manpower.
As a result, the barrier to even start contributing to TMK is extremely high.
On the other hand, QMK made the decision to allow all contributors to commit their own flavor of keyboard firmware into the main repository. Not only that, each keyboard firmware contains multiple user-submitted keymaps which rely on the API to provide personalized features. Over time, hundreds of firmwares and keymaps have accumulated, and every future change must either not break existing code or be made with permission from the initial contributor. End users will end up pulling unnecessary files too.
This extra burden offers little convenience to end users, at the cost of slowing down project velocity by orders of magnitude. I would suggest abstracting the drivers of QMK out into their own repository and having manufacturers/users import that as a git submodule.
Gatsby is a free and open source framework based on React that helps developers build blazing fast websites and apps. Some of its main selling points are:
Gatsby also provides a set of contributor guides that I found very useful when trying to contribute to the project.
Initially, I familiarized myself with the Gatsby codebase by fixing a bug in the GraphiQL inspector. Once I was familiar with the project, I started to take up larger issues like refactoring gatsby-node into a set of well-structured utilities categorized by their function. I also contributed to their ongoing TypeScript migration, helping to migrate some of their source files and fixing some incorrect types.
These are some positive things that I've observed as a contributor on Gatsby.
A comprehensive set of procedures
Most open source projects have contributor guidelines that describe how contributors can help out and Gatsby is no different. However, Gatsby maintains the contributor guidelines as part of a larger set of documentation that describes their operating procedures at every step of the process. This set of documentation includes everything from filing/triaging issues and managing pull requests, to documentation and code guidelines, to even non-technical topics such as how one can contribute to the Gatsby community with blog posts, talks, workshops and more.
This comprehensive set of procedures helped me understand their workflow as well as what I was expecting to happen next. For example, by reading the document on managing pull requests, I was able to understand exactly who had the rights to review and merge my PR. This made the process very transparent and clear by removing any sort of guesswork.
A focus on building the community
After my first PR was merged into Gatsby, I was added into the Gatsby organization on GitHub. This caught me by surprise because large open source projects usually only add core contributors into their organizations. I soon realized that Gatsby automatically adds anyone who has merged a PR, and that the organization contained almost 3000 people. Anyone in the organization can review PRs, but only people in the gatsby/core and gatsby/learning teams can merge PRs.
By automatically adding all contributors into the organization, Gatsby has managed to build a community around its product. There is an internal discussion board where the core team and contributors talk about ideas on how to improve Gatsby and I think this is a great initiative. Gatsby also uses a Discord channel to facilitate discussions.
As part of the organization, contributors also get to have the Gatsby badge on their GitHub profile, which is always nice.
Incentives for contributors
This idea is probably not feasible for all open source projects because it requires some financial capacity, but Gatsby has a reward system for contributors. Gatsby has its own merchandise store where they sell some project-related items. For Gatsby contributors, 1 merged PR earns you any $10 item for free, and 5 merged PRs earn you any $26 item.
When my first PR was merged, Gatsbot (Gatsby's GitHub bot) notified me that I qualified for something from their store, which was a pleasant surprise. This ties in with Gatsby's overall emphasis on building a community around the product by building a sense of identity through their merchandise. This is also probably good publicity for Gatsby as it increases the likelihood of contributors talking to others about the product.
Using CODEOWNERS
Gatsby has several teams that manage different parts of the project. They use a CODEOWNERS file to map parts of the project to their respective "owner" teams. By doing so, new PRs opened will automatically have their changed files matched against the CODEOWNERS file and a review will be automatically requested from the team responsible. This ensures that PRs are reviewed promptly and that no PRs get lost in Gatsby's busy PR tracker.
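To illustrate the matching step described above, here is a minimal Python sketch. It is not Gatsby's or GitHub's implementation: the patterns and team names are hypothetical, and it uses shell-style wildcards from the standard library rather than GitHub's full pattern syntax. It does keep the CODEOWNERS rule that the last matching pattern takes precedence.

```python
from fnmatch import fnmatchcase

# Hypothetical rules in file order: (pattern, owning team).
RULES = [
    ("*", "@example/core"),
    ("docs/*", "@example/learning"),
    ("packages/cli/*", "@example/cli"),
]


def owner_for(path, rules=RULES):
    """Return the team of the last rule whose pattern matches the path."""
    owner = None
    for pattern, team in rules:
        # fnmatchcase's "*" also matches "/" characters, so "docs/*"
        # covers nested files in this simplified model.
        if fnmatchcase(path, pattern):
            owner = team  # later rules override earlier ones
    return owner


def reviewers_for(changed_files, rules=RULES):
    """Return the sorted set of teams to request a review from."""
    owners = {owner_for(path, rules) for path in changed_files}
    owners.discard(None)
    return sorted(owners)
```

For example, calling reviewers_for with one file under docs/ and one elsewhere would request reviews from both the hypothetical learning and core teams.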
Since I'm working on the TEAMMATES team, I've come up with some things that could be adapted from Gatsby. For the most part, TEAMMATES already adopts most of what Gatsby does.
If we have the financial means to, perhaps we can consider incentivizing contributors in some way. Gatsby is well-funded so they can offer expensive things like T-shirts and hoodies but there are more affordable options like stickers. From what I've seen in open source projects, people are always happy to receive some kind of acknowledgement for their contributions, even if it's something small like stickers. This would also be good publicity and would increase the visibility of the TEAMMATES project in the long run.
We can consider implementing a CODEOWNERS file, but it might be slightly challenging because we don't really have clear separation of "owners" in the project. Still worth exploring if we're interested in assigning this type of responsibility in the future.
Hugo is a widely used static site generator written in Go, with over 40.7k stars on GitHub. It claims to be the fastest static site generator. At its most basic, it can transform Markdown, HTML and CSS files into a static site easily when combined with other tools like Netlify. However, unlike other simpler static site generators, Hugo also boasts complex content management and great built-in templates for users to use.
I took up this project as I wanted to learn more about Golang instead of doing my usual JavaScript stuff.
Allow raw string literals in shortcode params
Hugo has a great feature called shortcodes that allows Hugo pages to contain variables. These variables can also take in "parameters" of sorts, adding to their overall usefulness. I helped to add support for escaped quotes and Golang raw strings in this PR for such parameters. I found this to be technically challenging as I had to delve into the lexer code and understand how it parsed these parameters. In fact, the lexer in Hugo is based on a great talk given by one of the Golang core team members.
Add hugo.IsProduction shortcut
Hugo offers conditionals to selectively render certain parts of the markdown file. I added a shortcut for a commonly used conditional.
Add support for newline in raw string shortcode
This is a bug fix which helped to add support for newline characters in raw string shortcode parameters.
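To give a feel for what parsing such parameters involves, here is a minimal Python sketch. It is not Hugo's Go lexer, and the function name is hypothetical. It assumes parameters are separated by whitespace, that double-quoted values may contain escaped quotes, and that backtick-delimited raw strings are taken verbatim, newlines included.

```python
def split_shortcode_params(text):
    """Split a parameter string into a list of values.

    Supports bare words, "double-quoted" values with \\" escapes,
    and `raw` backtick values that are kept exactly as written.
    """
    params = []
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if ch.isspace():
            i += 1
        elif ch == "`":
            # Raw string: everything up to the next backtick, unprocessed.
            end = text.find("`", i + 1)
            if end == -1:
                raise ValueError("unterminated raw string")
            params.append(text[i + 1:end])
            i = end + 1
        elif ch == '"':
            # Quoted string: stop at the first quote that is not escaped.
            i += 1
            buf = []
            while i < n and text[i] != '"':
                if text[i] == "\\" and i + 1 < n and text[i + 1] == '"':
                    buf.append('"')
                    i += 2
                else:
                    buf.append(text[i])
                    i += 1
            if i >= n:
                raise ValueError("unterminated quoted string")
            params.append("".join(buf))
            i += 1  # skip the closing quote
        else:
            # Bare word: runs until the next whitespace character.
            start = i
            while i < n and not text[i].isspace():
                i += 1
            params.append(text[start:i])
    return params
```

A real lexer would emit typed tokens and track positions for error messages; this sketch only shows why quotes, escapes and raw strings each need their own scanning rule.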
Like most other open source projects, Hugo works off a fork and branch model. Users fork the Hugo repository and make their changes on a branch of their fork, before sending the finished pull request back to the main repository. As Hugo is maintained by only a few developers, only one approver is needed before the PR is squashed and merged.
Hugo also does releases fairly often, so it will not take long before a new contribution is included in a release.
I've enjoyed working with the Hugo codebase, no doubt due to the following few features.
Hugo's creator and chief maintainer, bep, does a great job responding to new issues and PR fixes. All my PRs were reviewed within 3 days. This responsiveness enables contributors to contribute more effectively, as they are able to quickly follow up with fixes while the context of the PR is still fresh in their mind. More importantly, Hugo also has its own forum, where its many users can ask questions and get fellow Hugo users or maintainers to help answer them. The forum is fairly active and is very friendly and welcoming to newcomers, making it an effective way to integrate new users and developers. In fact, Hugo's GitHub issue tracker is used primarily for proposing enhancements or formalizing bug reports. This also reduces the burden on maintainers and developers, especially since Hugo is not supported by big companies, unlike projects like React.
All of Hugo's features have great documentation that is up to date and easily accessible on its website. This sets it apart from its counterparts as users are able to better understand the power and functionality that Hugo offers, hence increasing adoption. All PRs to Hugo that change functionality must also update the documentation whenever necessary, thus helping it to maintain documentation quality.
When faced with a huge project, I sometimes get intimidated by its size and do not even bother setting up to see if I can contribute to it. However, with Hugo, there is a great contributing guide for newcomers that details how to set up the project from scratch, and even offers help on some basic Git functionality. This thorough guide helps to reduce errors while setting up and enables more people to contribute. The guide is given above under the Contributions section. This ease of setup is also in part attributed to the nature of Golang, which focuses a lot on simplicity.
However, I have also encountered some difficulties while contributing to this project. In particular, Hugo does not have much code-level documentation for developers. The bulk of commits are done by the small team of maintainers, who are very familiar with the codebase. As a new developer, I often had to use the external documentation for users to try to locate, and then understand, the code. As Hugo is also a fairly large project, developers who are new to the project may have to follow multiple call stacks in order to figure out where to make their change. Also, some of the variable names used are quite similar, which can further confuse the developer. To help ease new developers in, one improvement could be to add a Markdown file to each top-level folder explaining what the code inside does and the functionality it supports. This would go a long way in helping new contributors make changes effectively and quickly.
TEAMMATES and Hugo have similar review processes, including the need for PRs to be approved and for CI checks to pass. I think the key thing TEAMMATES could learn from Hugo is not related to the code, but more to the documentation and community. I think that TEAMMATES does not effectively retain developers, in that people contribute and then leave permanently. When they do so, we lose their expertise and insight. Code that they contributed may also be more difficult to maintain. The same cannot be said about Hugo. Even though it has been a few weeks since my last contribution, I still get an automated email from the forum informing me of new posts, thus helping to keep me engaged.
As a school/student-run repository, TEAMMATES may not have the resources or time to set up such a complex forum. However, we could set up a simple Telegram chat or Slack channel where developers can ask questions quickly without the hassle of raising a GitHub issue. This is also important as it seems that many of our one-time contributors are students, and Slack and Telegram are widely used by this demographic. The Slack channel will also enable maintainers to quickly onboard new developers, as we may not check GitHub that often.
Another thing we could learn from is the excellent documentation. TEAMMATES actually has a good developer guide. However, it is not frequently updated, and a good deal of issues on our issue tracker are related to setting up. The documentation for setting up could be reviewed and expanded to cover different OSs, and the FAQ section could be expanded when we spot common bugs occurring. This way, we can get more people to contribute to the project.
ZAP is a flagship project from the Open Web Application Security Project (OWASP) that has become the world's most popular free, open source web security tool. ZAP is actively used by penetration testers and security specialists worldwide, maintained by a dedicated international team of volunteers. Under the hood, ZAP is a complex and mature Java desktop application created for web application security testing with an experienced and dedicated core of developers and maintainers.
I added a button to the HTTP manual resend screen in ZAP core to enable regeneration of the Anti-CSRF token if any were present. This addition was difficult as it involved code from the UI all the way to the HTTP APIs. Extensive research on Cross Site Request Forgery (CSRF) attacks and their protection mechanisms was required to understand how to regenerate anti-CSRF tokens and inject them into the HTTP request to be sent out.
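As a rough illustration of the regenerate-and-inject idea, here is a minimal Python sketch. It is not ZAP's implementation (ZAP is written in Java). It assumes the fresh token arrives in a hidden form field of an HTML page with double-quoted attributes, that the request body is URL-encoded form data, and that the field is named csrf_token; the field name and both function names are hypothetical.

```python
import re
from urllib.parse import parse_qsl, urlencode


def extract_token(html, field_name):
    """Return the value of the input named field_name, or None if absent."""
    for tag in re.findall(r"<input\b[^>]*>", html, flags=re.IGNORECASE):
        name = re.search(r'name="([^"]*)"', tag)
        value = re.search(r'value="([^"]*)"', tag)
        if name and value and name.group(1) == field_name:
            return value.group(1)
    return None


def refresh_token(request_body, html, field_name="csrf_token"):
    """Replace the stale token in a form-encoded body with the fresh one."""
    token = extract_token(html, field_name)
    if token is None:
        return request_body  # nothing to regenerate, leave the body untouched
    pairs = parse_qsl(request_body, keep_blank_values=True)
    pairs = [(key, token if key == field_name else val) for key, val in pairs]
    return urlencode(pairs)
```

In this sketch the caller is expected to fetch the page containing the form again just before resending, so that the extracted token is still valid.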
Relevant PRs:
I added a new passive scan rule to ZAP which allows HTML and JavaScript responses to be scanned for dangerous JS functions that could leave a web application vulnerable. The implementation for this was particularly difficult as the senior developers suggested I support a custom payload so that users can define their own functions to look for at runtime. On top of learning how HTML/JS is structured for effective scanning, I learned how to utilise a custom extension and manage dependencies in Gradle.
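The core idea of such a rule can be sketched in a few lines of Python. This is not the rule I contributed (which is written in Java against ZAP's APIs), and the default function list below is only illustrative, not ZAP's actual payload. The second argument plays the role of the custom payload, letting the caller supply their own function names.

```python
import re

# Illustrative defaults only; a user-supplied list can replace them.
DEFAULT_FUNCTIONS = ["eval", "document.write", "setTimeout"]


def find_dangerous_calls(body, functions=DEFAULT_FUNCTIONS):
    """Return the sorted names of listed functions that appear to be called."""
    found = set()
    for name in functions:
        # The name must not be the tail of a longer identifier and must be
        # followed by an opening parenthesis, optionally after whitespace.
        pattern = r"(?<![\w$])" + re.escape(name) + r"\s*\("
        if re.search(pattern, body):
            found.add(name)
    return sorted(found)
```

A regular expression like this will also match inside comments and string literals, so a real rule would need to weigh false positives against the cost of properly parsing the response.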
Relevant PRs:
ZAP has extensive documentation on contribution and development guidelines. In brief, they follow a forking workflow similar to most OSS projects, and require that PRs made be associated with issues raised. There was also a very useful blog post series outlining in detail how one can start contributing to ZAP.
Relevant Links:
I chose ZAP as my external project so that I could begin to develop expertise in both Java and computer security. I learned a lot about functional interfaces, callback patterns, and method forwarding in Java. I also learned in greater depth how UI design and implementation is done in Java Swing. One interesting lesson was how to operate multiple repositories in one workspace in IntelliJ. Since my external project had different git repositories interacting with each other, learning how to work with multiple modules using Gradle was quite enlightening.
Equally important was the knowledge I gained about web application security. The contributions I made to ZAP were highly non-trivial and required extensive research into security concepts. I learned a lot about the CSRF attack, as well as professional defences against it, in great detail. I was also exposed to a broad range of application security concepts and have a much better appreciation for the topic now, with a firm grasp of the jargon involved.
Additionally, one of the most important lessons for me was learning how to ask for help. This was the first time I had to work on such a complex project that I couldn't easily understand how things worked. Learning to ask the right questions and getting over the fact that you just had no idea what was going on was a very humbling experience.
Currently, TEAMMATES requires a review from a "junior" developer followed by a review by a senior developer for PR merging or approval. In that sense, there exist two teams of starkly different authority over the application. The current two-tier structure limits the availability of developers with the capacity to deal with issues like stale PRs or design conflicts, and furthermore restricts the agency of the junior developers, which could disincentivise them from further involvement.
One interesting practice of the external project that I feel could be adopted by TEAMMATES is the relatively flat maintainer hierarchy used to resolve PRs and issues. In order to be merged, PRs in my external project required two approvals from the dev team. There is only the one team, and approvals from any two members of the team suffice for resolving PRs. In that sense, each member of the dev team carries equal ownership over the project, which could be a very large motivating factor for continued involvement.
My suggestion is that a similar approach be taken with TEAMMATES, where either the function of the senior dev team gets expanded to allow inclusion of some junior developers or more authority is given to the junior development team (e.g. two reviews from a junior dev is equivalent to one from a senior dev). This helps provide the agency needed to sustain involvement and subsequently grow the application. The success of an open source project relies upon the developer experience, which is directly improved with a larger pool of dedicated maintainers to attend to new contributors and triage issues.
Ionic is the open-source mobile app development framework that makes it easy to build top quality native and progressive web apps with web technologies.
Ionic Framework is essentially a UI toolkit that consists of customized Web Components that achieve the same functionality as native mobile components on iOS and Android and mimic their look and feel. A comprehensive list of components available in Ionic Framework can be found here.
The document outlining the contribution workflow can be found here: CONTRIBUTING.md
Bug fixes:
New feature:
The contributing workflow of Ionic is a standard one similar to other open-source projects. Since Ionic Framework consists of multiple packages (core, angular, react, and vue), all new issues submitted by the contributors are first labelled triage. One of the core developers then attends to the issue and labels it with its package and type (bug or feature request).
Here are some observations made while contributing to Ionic:
Use CodePen to reproduce front-end bugs. One advantage of Ionic being a front-end framework that works with VanillaJS is the ease of testing and presenting bugs. CodePen functions as an online code editor where developers can create code snippets and test them. To illustrate a bug in a particular Ionic component, contributors can simply paste the code that reproduces the bug into CodePen, and reference Ionic's js and css files in the <head> section of the HTML. CodePen then displays the rendered page side by side. Any changes made to the code segment are also reflected in the rendered page in real time. Code segments on CodePen can be saved and shared with links. Hence, it is recommended to include a link to the code segment that presents the bug for easy verification and assessment of the bug by other contributors.
Use a bot created with a GitHub App to manage issues. All issues submitted by contributors are first labelled triage automatically by the ionitron-bot, which manages the issues. It highlights new issues to the core developers so that they are not easily missed, and prompts further action from the core developers, such as categorizing the issue or seeking clarification. The bot also automates closing and locking issues that do not follow the provided issue template, are out of scope (support type / "how to" questions), or have had an inactive conversation for weeks. It also automatically posts a message to the contributor when closing the issue, saving core developers from repeating the same instructions many times. With the bot, the core developers can spend their time on more important issues. This could be adopted by TEAMMATES if the number of contributors and issues grows.
Make constant releases. As an open-source front-end framework that is used by many projects, Ionic makes constant releases to deliver bug fixes as soon as possible. The intervals between releases are not fixed; depending on the importance and urgency of bug fixes, the intervals range from two weeks to one day. Moreover, there is a clear distinction between releases involving bug fixes and new features. Patch releases (version number changes by 0.0.1) only contain bug fixes while minor releases (version number changes by 0.1.0) contain both bug fixes and new features. Major releases (version number changes by 1.0.0) contain breaking changes and are preceded by beta releases (X.X.X-beta.X). These clear distinctions ensure that developers using Ionic will not easily introduce new behaviors or breaking changes while upgrading the framework.
Do screenshot tests. Since Ionic is a front-end framework, it is important that a PR does not change a component's look and behavior when it is not supposed to. With the aforementioned bot, screenshot checks are carried out to compare the look of components before and after the changes made in the PR, pixel by pixel. It automatically takes more than one thousand screenshots and highlights any mismatched screenshots to the contributor after comparison. The drawback of these tests is that they need extra computational resources and are very time-consuming. A complete round of screenshot tests takes about 8 days to finish. Hence, only PRs made by core developers can trigger screenshot tests.
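The pixel-by-pixel comparison behind such screenshot tests can be sketched as follows in Python. This is not Ionic's tooling. It assumes each screenshot has already been decoded into a list of rows of (r, g, b) tuples, and both function names and the max_mismatch threshold are hypothetical.

```python
def mismatched_pixels(before, after):
    """Count the positions where two same-sized images differ."""
    same_height = len(before) == len(after)
    same_widths = all(len(b) == len(a) for b, a in zip(before, after))
    if not (same_height and same_widths):
        raise ValueError("images must have the same dimensions")
    return sum(
        1
        for row_before, row_after in zip(before, after)
        for pixel_before, pixel_after in zip(row_before, row_after)
        if pixel_before != pixel_after
    )


def screenshots_match(before, after, max_mismatch=0):
    """Treat the screenshots as matching if few enough pixels differ."""
    return mismatched_pixels(before, after) <= max_mismatch
```

Every pixel of every screenshot has to be visited, which gives some intuition for why running more than a thousand such comparisons per PR is costly.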
Netlify CMS is an open source content management system that enables developers to provide editors with a friendly UI and intuitive workflows. It can be used with any static site generator to create faster, more flexible web projects. Content is stored in the developer's Git repository alongside the code for easier versioning, multi-channel publishing, and the option to handle content updates directly in Git.
I chose Netlify CMS as my external project as it aims to be a core feature within the JAMStack, an up-and-coming tech stack that delivers fast and secure sites and apps by pre-rendering files and serving them directly from a CDN, removing the requirement to manage or run web servers.
I helped to add support for different languages within the application and some external widgets using the React Polyglot library. This was particularly difficult as the application uses a single source of truth for the different locales within the core component, but also had to be referenced by the external widgets. I managed to figure out how the widgets were integrated into the core component and subsequently how to pass the references over. As proof that it works, I also added several screenshots and translations to the PRs.
Relevant PRs:
I fixed a bug that resulted in a mismatch between the documentation and the implementation regarding the media library usage for uploadcare and cloudinary.
Relevant PRs:
Netlify CMS has a documentation page on contribution and development guidelines. They utilise a forking workflow and feature branches. Contributors are required to rebase on master when opening a Pull Request and before merging a Pull Request. Pull Requests are reviewed by at least two maintainers prior to merging. There is also a community Slack channel and forums that allow new developers to chat with maintainers and see which issues are being actively discussed.
The contributing guidelines are well-documented. They include comprehensive guides for setting up, testing, linting, hot-reloading and making Pull Requests. It was easy for me as a newcomer to get started with the project. I particularly liked that the Netlify CMS community is very encouraging towards new contributors. To quote parts of their documentation:
Relevant Links:
Netlify CMS has a public Slack group for discussions with dedicated channels for both users of Netlify CMS as well as developers. This makes it easy for new developers to understand the issues that users are facing as well as ongoing discussions within the developer community. New contributors can also join discussions and chat with the maintainers directly if they need help on issues they are working on.
Much like Netlify CMS, TEAMMATES could benefit from having a community Slack channel to reach out to a larger pool of new developers. Currently, TEAMMATES relies on the use of 'Good First Issues' for newcomers to contribute to the project, but they are not frequently raised. GitHub issues are also less conducive to general discussion than the more responsive Slack channels. Having a community Slack channel allows new developers to approach and speak directly to current maintainers to look for issues or seek help on issues that they are working on.
A good practice that Netlify CMS adopts is to enforce style guidelines on Git commits through the use of pre-commit and pre-push hooks with Husky. This results in a more consistent set of commit messages and makes it manageable for developers to read past changes.
Another common issue I observed in TEAMMATES is that developers sometimes forget to lint or run regression tests before pushing to GitHub, thus failing the CI and requiring separate commits to fix these trivial issues. Pre-push hooks could likewise be implemented to enforce linting and testing before a developer pushes to GitHub.