Knowledge gained from Projects

CATcher:

MarkBind:

RepoSense:

TEAMMATES:

CATcher

Sun Xinyu

Tool/Angular Framework Overview

Without any prior knowledge of Angular, I quickly went through the introductory part of a TypeScript tutorial and a hands-on Angular practice on the official Angular website to familiarise myself with the framework.

Angular adopts the MVVM (Model-View-ViewModel) architecture, similar to MVC (Model-View-Controller) but with some differences: in MVVM, the ViewModel replaces the role of Controller in MVC and acts as a mediator between the Model and the View. The ViewModel provides data and functionality to the View and the View communicates user actions back to the ViewModel.

Angular also makes use of Components, Services, Directives and RxJS to build an Angular application:

  • Components are the building blocks of an Angular application. They are responsible for defining the view, handling user input, and interacting with the model.
  • Services are used to share data and functionality across components. They provide a way to centralize common functionality and make it available to the entire application.
  • Directives add behavior to DOM elements, allowing us to manipulate the DOM and add custom functionality to applications.
  • RxJS is a reactive programming library that provides a way to work with asynchronous data streams and handle events in a declarative way (a minimal sketch combining these concepts follows this list).
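To make these ideas concrete, here is a minimal, hypothetical sketch (not CATcher/WATcher code) of a component consuming a service through an RxJS stream:

```typescript
import { Component, Injectable, OnInit } from '@angular/core';
import { Observable, of } from 'rxjs';

// A service centralises shared functionality; providedIn: 'root' makes it a
// single app-wide instance.
@Injectable({ providedIn: 'root' })
export class GreetingService {
  // Data is exposed as an asynchronous stream rather than a plain value.
  getGreeting(): Observable<string> {
    return of('Hello from the service!');
  }
}

// A component defines the view and reacts to the stream.
@Component({
  selector: 'app-greeting',
  template: '<p>{{ message }}</p>',
})
export class GreetingComponent implements OnInit {
  message = '';

  constructor(private greetingService: GreetingService) {}

  ngOnInit(): void {
    // The view updates when the value arrives.
    this.greetingService.getGreeting().subscribe((msg) => (this.message = msg));
  }
}
```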

Angular recommends the use of TypeScript for its strict type checking of variables, as compared to plain JavaScript.

The TypeScript tutorial provides a very in-depth explanation of the language and lists the notable differences between TypeScript and other common programming languages. It includes a great number of details but can be overwhelming for beginners. I briefly looked through it and wrote some common algorithms in TypeScript to make sure I roughly knew the basics before proceeding to read about Angular. This resource is better used as a handbook to consult when one encounters complex TypeScript-specific problems.

The official Angular Getting Started guide provides a walk-through of building a shopping website with Angular, involving components, services, and data management and transfer -- essentially everything needed for a basic website. It was a fun experience, and the guide is very clear and helpful.

Angular Upgrade

In WATcher, I drafted PRs to upgrade the Angular version from 8 to 10. While following the instructions in the official guide, I ran into various problems along the way: outdated versions of certain dependencies, linting errors, wrong modifiers, and manual upgrades of certain new imports after merging. I managed to solve them accordingly.

Tool/GitHub Actions + MarkBind

While setting up the WATcher-docs website, I read about GitHub Actions and how they are triggered on pull requests or on changes to specified branches. I also read the MarkBind documentation to understand how to deploy a website written with MarkBind using GitHub Actions.


Li Zhaoqi

Angular and TypeScript

Having once worked with Angular and TypeScript in one of my personal projects, I had some prior experience with Angular. However, to refresh my memory and familiarise myself with the Angular environment, I frequently referred to the online Angular documentation for help.

Both Angular and another popular framework, React, use a component-based architecture in which developers build reusable components, classes and directives to assist in building the application. By modularising different aspects of the application into different modules, Angular enables each component to have its own specialisation and allows easy unit testing of each individual component.

While Angular seems very convoluted and heavy compared to React, a lot of its power and ease of use is actually automated via the Angular CLI, which lets developers generate template files for each component they are building.

The benefit of TypeScript is noticeable in the long run. While JavaScript has dynamic typing and inferred types, in the long run it is extremely difficult for developers to maintain and manage without proper documentation due to the lack of typing. TypeScript essentially fixes this issue by enforcing types on variables and enabling better linting than base JavaScript. TypeScript adds an additional layer of compilation, where TypeScript code is compiled into JavaScript before it is run. While this seems like extra effort, the benefits TypeScript provides are very valuable, as mentioned before. Besides this, TypeScript also supports convenient operators such as ?. and ??, where the expression short-circuits if the base value is undefined or null.
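A minimal sketch of those two operators:

```typescript
// Optional chaining (?.) stops evaluation when the base is null/undefined;
// nullish coalescing (??) supplies a fallback only in that case.
interface Profile { nickname?: string; }
interface User { profile?: Profile; }

const user: User = {};

const nickname = user.profile?.nickname; // undefined, no TypeError thrown
const display = nickname ?? 'Anonymous'; // unlike ||, '' and 0 would be kept

console.log(display); // "Anonymous"
```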

HTML Text area history events

While working on adding a custom undo and redo stack for CATcher, I learnt how HTML manages its built-in undo and redo feature in its text areas. Initially it was rather mysterious when and how data is stored in the text area to enable features like undo and redo out of the box. I discovered that such features are nullified when the value of the component is manually changed via code. However, when using libraries like the Markdown Toolbar, the history is still preserved.

It turns out that this is achieved via events specified by a set of platform APIs accessible through the JavaScript method document.execCommand. In this case, storing of history is done via the Internet Explorer command identifiers found here. Specifically, users can wrap the changes made to the text area with 2 commands, ms-beginUndoUnit and ms-endUndoUnit. These 2 commands essentially serve as a "mutex": changes made to the text area during this period are viewed as a single change instead of multiple changes, and are stored in the history stack of the default HTML element.
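As a hedged illustration (a hypothetical helper, not CATcher's actual code), a toolbar action could group its edits like this:

```typescript
// Wrap a programmatic edit in one undo unit so a single Ctrl+Z reverts it.
// insertText goes through the browser's editing pipeline, so the textarea's
// built-in history records it (unlike assigning .value, which clears history).
function surroundSelection(el: HTMLTextAreaElement, marker: string): void {
  el.focus();
  document.execCommand('ms-beginUndoUnit'); // open the undo "unit"
  const selected = el.value.slice(el.selectionStart, el.selectionEnd);
  document.execCommand('insertText', false, `${marker}${selected}${marker}`);
  document.execCommand('ms-endUndoUnit'); // close it: recorded as one change
}
```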

Javascript Events

In many JavaScript components involving input and output (IO), it is important to understand how events are propagated: specifically, how each event is "captured" and "bubbled" through different components, and how to control the flow of such events. For example, suppose a user clicks on a certain component (say, a div tag). This component could be nested within many other components and tags. So if a user clicks on that div, how should its parent components react to the event? There are generally 2 schools of thought on how this behavior should be implemented.

First, there is the capturing of events, where the "click" is captured from the outermost parent component down to the target component that was actually clicked.

Second, there is the idea of bubbling, where events originate from the source of the click (the target component) and slowly "bubble" up through its parent components.

JavaScript generally implements both of these in 2 phases. When an event is triggered, it first "captures" down through the relevant components, and then the event "bubbles" back up. For more details regarding these behaviors, as well as how to control them, I referred to the link here for reference.
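A minimal sketch of the two phases in action:

```typescript
// The same click is seen by the parent during capture, then by the target,
// then by the parent again while bubbling.
const parent = document.createElement('div');
const child = document.createElement('button');
parent.appendChild(child);

parent.addEventListener('click', () => console.log('parent (capture)'), { capture: true });
child.addEventListener('click', () => console.log('child (target)'));
parent.addEventListener('click', () => console.log('parent (bubble)'));

// Logs: parent (capture), child (target), parent (bubble).
// Calling event.stopPropagation() in any handler halts the rest of the flow.
child.click();
```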

GraphQL

Traditionally, interactions with servers are done via REST APIs, using POST and GET to retrieve information. However, the problem is the additional information that is unrelated to what the user actually wants. For instance, if someone queries for certain objects on the server, unimportant information is still retrieved due to the fixed API responses. A much more desirable behavior is to supply users with exactly what they want in a single, compact request. This is where GraphQL comes in. GraphQL gives clients the freedom to select exactly the data they need, allowing much faster and more fluid development compared to traditional REST.

A GraphQL server is set up on the host to expose the data as a GraphQL API to clients. This server follows the GraphQL specification and allows clients to freely query for the specific information they need. CATcher and WATcher specifically use the GitHub GraphQL server to query for information. Due to a rate limit of page size = 100, we still need to make multiple requests to fetch all the required information.

Github GraphQL explorer: Explorer
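As a hedged sketch (a generic query, not the exact one CATcher/WATcher uses), fetching all issues 100 at a time from the GitHub GraphQL API looks roughly like this:

```typescript
// Page through repository issues, following the cursor until exhausted.
async function fetchAllIssues(owner: string, repo: string, token: string) {
  const query = `
    query($owner: String!, $repo: String!, $cursor: String) {
      repository(owner: $owner, name: $repo) {
        issues(first: 100, after: $cursor) {
          nodes { number title }
          pageInfo { hasNextPage endCursor }
        }
      }
    }`;
  const issues: { number: number; title: string }[] = [];
  let cursor: string | null = null;
  let hasNextPage = true;

  while (hasNextPage) {
    const res = await fetch('https://api.github.com/graphql', {
      method: 'POST',
      headers: { Authorization: `bearer ${token}`, 'Content-Type': 'application/json' },
      body: JSON.stringify({ query, variables: { owner, repo, cursor } }),
    });
    const { data } = await res.json();
    const page = data.repository.issues;
    issues.push(...page.nodes);       // only the fields we asked for
    hasNextPage = page.pageInfo.hasNextPage;
    cursor = page.pageInfo.endCursor; // resume from here on the next request
  }
  return issues;
}
```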

RxJS

A major difference in UI/UX development is the need to handle interactions with users. Things like waiting for data from servers and updating elements based on changes can be rather difficult and tiring to code using plain asynchronous programming methods. RxJS adds the Observer pattern to the code while abstracting away the intricacies of asynchronous code. This enables developers to easily leverage the Observer pattern while maintaining clean and readable code.
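A minimal RxJS sketch of that pattern:

```typescript
import { Subject } from 'rxjs';
import { debounceTime, distinctUntilChanged } from 'rxjs/operators';

// A stream of search terms; subscribers react declaratively as values arrive.
const searchTerms = new Subject<string>();

searchTerms
  .pipe(
    debounceTime(300),      // wait for the user to pause typing
    distinctUntilChanged(), // skip duplicate consecutive terms
  )
  .subscribe((term) => console.log(`fetch results for "${term}"`));

// Somewhere in an input handler:
searchTerms.next('cat');
searchTerms.next('catcher');
```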

Wong Chee Hong

Angular Essentials

I had contributed to CATcher as part of IWM, but I had never really approached the Angular aspects of the project.

Essentially, the core ideas behind Angular involve:

  • Components, a TypeScript class with @Component decorator, an HTML template and styles.
    • The decorator accepts parameters that tell Angular which HTML file is the component's template and which CSS file contains the component's styles.
    • The decorator also accepts a parameter that is the component's selector, which is how we can reuse this component as an HTML element in other HTML files.
  • An HTML template that instructs Angular how to render the component
  • An optional set of CSS styles that define the appearance of the template's HTML elements

The other key concepts include event binding and property binding, which link the template to the TypeScript class. Knowing these essentials allowed me to fix WATcher PR#57.
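A minimal sketch of the two bindings (a hypothetical component, not the actual PR fix):

```typescript
import { Component } from '@angular/core';

@Component({
  selector: 'app-like-button',
  template: `
    <!-- (click) is an event binding; [disabled] is a property binding -->
    <button (click)="like()" [disabled]="liked">
      {{ liked ? 'Liked!' : 'Like' }}
    </button>
  `,
})
export class LikeButtonComponent {
  liked = false;
  like(): void { this.liked = true; }
}
```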

Another key part of Angular is its dependency injection system and services. Angular allows us to provide dependencies at different levels of the application and to control how the dependencies are instantiated.

  • For example, when you provide a service at the root level, Angular creates a single, shared instance of the service and injects it into any class that asks for it.
  • Also, it seems that most of WATcher's and CATcher's services are provided at the root level.

Finally, as part of fixing "Remove label-filter-bar as module export #92", I also learned how related components are organized and grouped into modules. Each module is self-contained and provides a certain set of functionality and components related to it, thereby achieving separation of concerns.


E2E Testing with Playwright

After having 2 separate hotfixes pushed in a single semester, I started to look more deeply into ensuring the robustness of our application. In both hotfixes, the bugs were only uncovered during manual testing. However, manual testing is time consuming, and we needed a way to automate it. E2E tests simulate user interactions such as clicks and typing, and are a useful way to ensure our end product performs as expected.

During this semester, one of the high priority issues was to migrate our E2E solution away from Protractor. As such, I have investigated Cypress and Playwright as two potential E2E solutions.

Mocking services

While performing the migration from Protractor to Playwright, I learned about the different strategies with which E2E tests can be conducted. Typically, we would want to run E2E tests against our production server, since that is what our end users will be using. However, since CATcher depends a lot on GitHub's API for its functionality, we are unable to perform automated tests against GitHub. A second strategy is to mock the functions that hit GitHub's API and test solely the functionality and behaviour of the app. This made me realize that there are separate test and production versions of CATcher.

I have also looked into whether it is possible to perform E2E testing against the production server, since one of the bugs fixed in the hotfixes can only be caught if we did not adopt a mocking strategy. One of the key feasibility concerns I had with testing against the GitHub API was simulating user authentication. This was because authenticating with GitHub requires multi-factor authentication, something that is difficult to achieve with automated E2E testing. Some potential solutions to bypassing MFA would be to use TOTP, which can be generated programmatically. More research will be needed in this area.

Aspects Learnt

  • Configuring and setting up Playwright for a project.
  • Learned about how Playwright/Cypress/Protractor identifies and interacts with HTML elements using selectors.
  • Learned about how CATcher API calls are mocked during E2E testing (a sketch of this idea follows the list)
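A hedged sketch of that mocking strategy (hypothetical URL and selectors, not CATcher's actual test suite):

```typescript
import { test, expect } from '@playwright/test';

test('shows issues from the mocked API', async ({ page }) => {
  // Intercept the API request and fulfill it with canned data, so the test
  // never hits the real GitHub API.
  await page.route('**/repos/**/issues*', (route) =>
    route.fulfill({ json: [{ number: 1, title: 'Sample issue' }] }),
  );

  await page.goto('http://localhost:4200');
  await page.getByRole('link', { name: 'Issues' }).click();
  await expect(page.getByText('Sample issue')).toBeVisible();
});
```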


Github Actions

I also picked up GitHub Actions when contributing to the CI/CD pipeline in Enable linting in Github workflow #81. I learned how GitHub Actions are set up and how they can be triggered on pushes to main/master as well as on pull requests.

Furthermore, I learnt how we can use matrix strategies to run the same job with different parameters, such as different OS and different node versions.


RxJS and the Observer Pattern

Part of working with CATcher source code was frequently encountering Observables and Observers. RxJS supports Observers and Observables, allowing updates to some Observable to be received by some Observer that is subscribed to it. With this pattern, we can trigger updates in many dependent objects automatically and asynchronously when some object state changes.


Vignesh Sankar Iyer

Angular

Having had experience mainly in React and NodeJS projects, I was more used to creating projects with Functional Components rather than Class Components as in Angular. However, I realised that one of the key aspects of frontend frameworks, namely reactivity, was in fact one of the main drivers of the development of such frameworks in the first place!

In fact, even React originally championed the idea of Class Components in order to isolate various web components into areas of responsibility, following rule number 1 of Software Engineering: Single Responsibility. However, while React is largely unopinionated in how you structure your code with regard to the coupling of business logic and HTML, Angular differs by dictating where and how you structure your components.

Angular separates components into modules which comprise 3 to 4 files:

  • Components, which are necessarily TypeScript classes which have the @Component decorator;
  • Templates, which dictate the HTML that is produced and rendered by the component;
  • Styles, which dictate the type of styling to apply to the component.
  • Module, which indicates the modules or services that are to be imported by the component.

On the other hand, React only dictates that class components should produce some sort of HTML using the render function. Even this is removed with the introduction of Functional Components that are simply functions which render and produce some HTML. React introduces hooks which are often used by developers to manage some state at the component level, using functions with side effects.

Each approach has its positives and negatives. Because of its opinionated nature, Angular makes it easy to standardize frontend coding standards and patterns across an entire enterprise, making it an apt choice for OSS development. On the other hand, React allows you to develop code more quickly, with more attention needed on rendering lifecycles in order to let the Virtual DOM know when a particular component needs to be rendered again. On top of this, Angular wholly separates business logic from rendered HTML, whereas React does not make this distinction.

Another key point is how React and Angular differ in providing context (sharing or passing down state between different branches of the DOM tree). React has its own Context API that is used to share state between different components, whereas Angular does this via the providers declaration in the module file, which results in a set of singletons shared by components that exist below it in the tree.

RxJS

I also picked up RxJS along the way, which is Angular's answer to creating reactive components. RxJS essentially provides asynchronous pipe/filter and publisher/subscriber behavior, which allows values to change and other components or functions to subscribe to those changes. This works hand in hand with Angular's change detection strategy, which I explain below.

In comparison, React introduced and adopted hooks to encapsulate re-rendering behavior. React does this by operating on a Virtual DOM and appropriately re-rendering components and their children in patches when a change is detected. Angular, on the other hand, does not have such an abstraction for re-rendering components whose state has changed. Instead, Angular uses a change detection strategy that can be configured by the user (either OnPush or Default). Angular change detection works by using Zone.js and activating after every async action performed. CD traversal starts at the root component (usually App) and works its way down the component tree, updating the DOM as needed. What's happening under the hood is that browser events are registered with Zone.js - Angular's mechanism for orchestrating async events - which emits changes after the initial template bindings are created. ...
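A minimal sketch of opting a component into the OnPush strategy:

```typescript
import { ChangeDetectionStrategy, Component, Input } from '@angular/core';

@Component({
  selector: 'app-user-card',
  template: '<p>{{ user.name }}</p>',
  // Angular re-checks this component only when an @Input reference changes
  // or an event originates from it, instead of after every async task.
  changeDetection: ChangeDetectionStrategy.OnPush,
})
export class UserCardComponent {
  @Input() user!: { name: string };
}
```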

MarkBind

Elton Goh Jun Hao

Special mention to ChatGPT and GitHub Copilot

A fun fact is that I used ChatGPT and GitHub Copilot for everything in the list below. ChatGPT just makes it so much easier to write and debug code, and it has really helped me pick up the technologies and tools mentioned below. I find that GitHub Copilot is super helpful when writing boilerplate code, and code in general.

Vue.js

During the semester, I learned the fundamentals of Vue.js, including the Vue lifecycle, creating Vue components, and working with both Vue 2 and Vue 3. It was exciting to discover that my previous experience with React was easily transferable to Vue, which helped me to quickly grasp the fundamentals of the framework.

While using Vue, I also realised the importance of having a huge community behind a framework. When working with Vue, I found it harder to find solutions online as there are fewer resources available compared to React. However, I still had a great time learning Vue and picking up a new frontend framework.

Resource used:

  • Vue docs: This is the most valuable resource I used to learn Vue. The docs are pretty well written and easy to understand. However, there are some parts that are not very well documented such as SSR.

Monorepo and Monorepo management tools

During the semester, I learned about monorepos and monorepo management tools like Lerna, NPM Workspaces, and Turborepo. I gained an understanding of the benefits that a monorepo provides, such as simplified version control, dependency management and sharing of configs. I learned how monorepo management tools can help with versioning and also speed up running tests and builds through concurrency and caching.

Overall, I gained a greater appreciation for Monorepo and its management tools, and I can see how they can greatly simplify software development and improve efficiency.

Resource used:

  • Lerna docs: This is the main resource I used to learn about Lerna. The docs are pretty well written and easy to understand.
  • Fireship on monorepo: This is a good summary video by Fireship which shares about what exactly is Monorepo and why it is useful.

Serverside Rendering (SSR)

This is my first time actually working with SSR directly. Prior to this, I only had a basic understanding of SSR and was not sure how it works. Through working on the migration from Vue 2 to Vue 3, I have learnt a lot about how SSR works and why it is needed. Below is a list of things that I have learned about SSR:

  • How does SSR work and what are the benefits of using SSR?
  • How does SSR differ from Client-Side Rendering (CSR)?
  • What is client-side hydration and how does it work in conjunction with SSR?

I am really glad to have learnt about SSR, as it is getting more and more prevalent in the industry.

Resource used:

  • Vue docs: This is super helpful in understanding SSR in Vue.
  • Other than that, I think a lot of SSR and even CSR knowledge is learned from ChatGPT.

Webpack

Previously, in all honesty, I did not really know what Webpack is or how it works. Webpack seemed to be a magical tool that just works. Through updating Webpack and attempting to migrate Vue 2 to Vue 3, I learned a lot about Webpack and what it does. Below is a list of things that I have learned about Webpack:

  • What is Webpack and how does it work? (bundling etc.; a minimal config sketch follows this list)
  • Learned about different types of Webpack plugins
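To make the bundling idea concrete, here is a minimal config sketch (standard vue-loader plumbing; MarkBind's actual config is more involved):

```typescript
const path = require('path');
const { VueLoaderPlugin } = require('vue-loader');

module.exports = {
  entry: './src/main.js', // where bundling starts
  output: {
    path: path.resolve(__dirname, 'dist'),
    filename: 'bundle.js', // the single bundled artifact
  },
  module: {
    // A loader teaches webpack how to handle non-JS files such as .vue.
    rules: [{ test: /\.vue$/, loader: 'vue-loader' }],
  },
  plugins: [new VueLoaderPlugin()],
};
```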

Resource used:

  • Webpack docs: This is the main resource I used to learn about Webpack.

I learnt how to use Open Source Software

This is quite a random learning point, but I think it is important to mention. Somehow, I handled a lot of dependency upgrades. I learned how to safely update dependencies and ensure that they do not break the codebase. I also learned the importance of reading the changelogs and release notes of dependencies.

I had a painful experience debugging why HTML formatting changed after running npm install. The reason was that the developers of js-beautify had changed the way it formats custom tags but did not mention it in the release notes. This caused a lot of tests to fail, and I had to spend a lot of time debugging it.

Through this experience, I learned the importance of ensuring that changes are documented properly and correctly.

LEE WEI, DAVID

Vue.js

One of the largest takeaways from working with MarkBind in the last semester has been Vue.js, an open-source front-end framework that MarkBind uses to build its UI components. Previously knowing only the React.js framework, Vue.js is a handy addition to my arsenal. The basics of Vue.js were rather simple to pick up. By reading the Vue.js documentation and referencing examples of already implemented Vue components in MarkBind, I quickly understood the use of <template>, <style> and <script>. Through MarkBind's Developer Guide, I learnt how to easily create different kinds of Vue components and implement them in MarkBind.

As I implemented my first Vue component, Add autogenerated breadcrumbs component #2193, I delved deeper into Vue, exploring the use of data() to manage the internal state of Vue components and the methods option to define methods used within the component. I also learnt more about Vue lifecycle hooks, using the mounted hook to let the Breadcrumb component query the SiteNav to figure out the hierarchy of the current page. (A sketch of these pieces appears below.)
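A hedged sketch of how those pieces fit together in a single-file component (illustrative names, not the actual Breadcrumb implementation):

```vue
<template>
  <nav class="breadcrumb">{{ trail.join(' > ') }}</nav>
</template>

<script>
export default {
  data() {
    // internal state of the component
    return { trail: [] };
  },
  methods: {
    buildTrail() {
      // hypothetical: derive the hierarchy once the DOM is available
      this.trail = ['Home', document.title];
    },
  },
  mounted() {
    // lifecycle hook: runs after the component is inserted into the DOM
    this.buildTrail();
  },
};
</script>

<style scoped>
.breadcrumb { font-size: 0.9rem; }
</style>
```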

As I continued working on improving MarkBind's frontend, I learnt more about Vue's <transition> component, in particular the use of transition hooks. While working on Fix Quiz expanding between questions #2184, I came to realize how useful these hooks were in creating seamless transitions for different situations. I relied heavily on the Vue.js documentation and StackOverflow posts while researching Vue's transition hooks.

Document Object Model (DOM) Manipulation

When I was working on implementing the new Breadcrumb and Collapse/Expand All Buttons components, I had to make extensive use of Document.querySelector() and other related methods. I was new to this and had to do some research on how the methods work, what happens if the object cannot be found, and how to handle edge cases. By practicing these while implementing the two components mentioned above, I believe I have become more proficient at this. As a side effect, I have also gained a deeper understanding of how the DOM works.
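A minimal sketch of the defensive pattern involved:

```typescript
// querySelector returns null when nothing matches, so handle that case
// before touching the element.
const siteNav = document.querySelector<HTMLElement>('.site-nav');
if (siteNav !== null) {
  const links = siteNav.querySelectorAll<HTMLAnchorElement>('a');
  links.forEach((link) => link.classList.add('breadcrumb-candidate'));
} else {
  // e.g. the page has no site navigation; skip building breadcrumbs
}
```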


Jest/Vue Test Utils

Jest and Vue Test Utils were something I was new to coming into MarkBind. MarkBind uses Jest together with Vue Test Utils for its snapshot tests, which test Vue components against their expected snapshots. As I was updating and implementing Vue components, I had to update and create the relevant test suites to ensure the components were working as expected. I explored mounting components and attaching them to a document to allow other components to interact with them.
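A hedged sketch of such a snapshot test (hypothetical component and props):

```typescript
import { mount } from '@vue/test-utils';
import Breadcrumb from '../components/Breadcrumb.vue';

describe('Breadcrumb', () => {
  it('matches its snapshot', () => {
    const wrapper = mount(Breadcrumb, {
      propsData: { trail: ['Home', 'Guide'] }, // Vue 2 style props
      attachTo: document.body, // attach so DOM queries and focus work
    });
    expect(wrapper.element).toMatchSnapshot();
    wrapper.destroy(); // clean up the attached DOM
  });
});
```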


TypeScript

As MarkBind is undergoing a migration to TypeScript, I put in some time to learn basic TypeScript. This was important because, mid-way through the semester, many of the files were being migrated to TypeScript. It has also helped me review PRs that deal with the TypeScript migration and PRs that affect the TypeScript files in MarkBind.


UI

When updating the looks of old components and creating new ones, I had to do some research on what makes a website visually pleasing. My most interesting finds were about the use of golden ratios in design and choosing complementary colours with tools such as Canva's Colour Wheel. I also learnt the different meanings of different icons through exploration and discussions in Update Breadcrumb icons #2265 and Add CollapseExpandButtons component.

I also internalized how to create transitions and effects that fit the theme of the project, which for MarkBind was a more minimal one. This was done when updating the designs of components in Tweak sitenav design #2204 and Update Question/Quiz component design #2131.

Project Management

As I progressed to start managing the project, I started reviewing and merging PRs. Initially, as I reviewed smaller PRs, I had little problem understanding the code and seeing where it could be improved. However, as I reviewed more complex PRs, I began having difficulty understanding the changes quickly. I came across a method to understand code in a simpler manner: the Rubber Duck Debugging method. Using it helped me work through the code line by line and handle more complex changes more manageably, helping me understand them better.

Lee Hyung Woon

Tools / Technology used within MarkBind

Vue.js

One of the main technologies I learned during the course of CS3281 was Vue.js, an open-source JavaScript framework for building UI components. Previously, I had dabbled a bit in Vue.js, but not to the point where I could even call myself "familiar" with it. In order to work on some of the components (mainly the implementation of the new Toasts component), I had to learn Vue.js and how to implement a Vue component in MarkBind: how the different parts of a Vue component (namely the <template>, <script>, and <style> sections) work and interact with each other, the different lifecycle hooks and event handling available in Vue.js, the fundamentals of reactivity in Vue.js, etc.

The resources I used consist of:

  • Vue.js Documentation: This helped me get started with Vue.js, and reading through each section gave me a clearer understanding of how Vue.js worked, and the basics. However, to actually use it in MarkBind was a different problem...
  • MarkBind Developer Guide - Writing Components: ...which is where this section of the Developer guide came in. This section was very helpful in guiding me through the specific aspects of Vue.js that we are concerned with (and the Section on SSR was tremendously helpful in resolving some of the issues!).

Of course, as I slowly became more familiar with Vue.js and the Vue components, I started realizing the benefits that using Vue 3 would bring over Vue 2. For instance, dynamic CSS classes, available in Vue 3 but not Vue 2, were something I found a need for during the implementation of the Toasts component. As the course progresses, I expect to help out where I can with the ongoing Vue 2 to Vue 3 migration.

Nunjucks

Nunjucks is a templating engine for JavaScript, developed by Mozilla. I needed to investigate Nunjucks further when I was working on an issue with the {% raw %} and {% endraw %} tags in MarkBind, which are a way to work around double curly braces ({{ and }}) being processed as a Nunjucks variable. While I did not fully learn Nunjucks during this investigation, I nevertheless managed to learn how variables are processed in Nunjucks and how the Nunjucks syntax works.

The resources I used consist of:

  • Nunjucks Documentation: This was the primary resource I consulted to learn more about the behavior of Nunjucks and the available syntax.
  • MarkBind User Guide - Tips and Tricks: This section gave me hints on where to proceed in investigating why the Nunjucks syntax was causing problems for MarkBind. The section has since been re-written (by me!) to be more informative regarding how to avoid having variables pre-processed by Nunjucks.

TypeScript

While I was fairly familiar with TypeScript (along with HTML / CSS / JavaScript) prior to working on MarkBind, contributing to the ongoing TypeScript migration of the core MarkBind package has helped me better understand the strict features (and philosophy) of TypeScript. Hence, I thought that it at least deserves a mention in this section.

The resources I used consist of:

  • TypeScript Reference: This helped me quite a bit when trying to understand how to get started with the TypeScript migration for MarkBind.
  • MarkBind Developer Guide - Migrating to TypeScript: This section was excellently written - frankly, I think that the step-by-step process was vital for my understanding of how the TypeScript migration should work.

External Tools / Technology

Static Site Generators ("Competitor Analysis")

While working on the templates and the CLI aspects of MarkBind, I found that I needed to be at least familiar with how other static site generators do things. I ended up spending quite a bit of time looking into 5 of our "competitors" (though they fulfill different niches): Hugo, Gatsby, Jekyll, Docusaurus and MkDocs.

What I learnt from their documentation (and from subsequently trying them out myself to generate sites) is difficult to list, as it mainly involves the available features and how each generator tackled certain issues. However, some of the comments I have left on MarkBind issues showcase parts of my learnings.

The resources I used mainly consisted of the documentation for each of the static site generators.

I believe that as I progress through the module, I will learn more about other static site generators (that can help to give further insights into the directions that we want to push MarkBind towards).

Chan Yu Cheng

Vue

Vue components

Since MarkBind uses Vue components, I had to pick this up, having only worked with React before. I had to learn what a Vue instance is, how to compile Vue, and so on. Resources I used included the MarkBind dev page on SSR of course, the Vue official documentation and another Vue tutorial. This was especially useful when dealing with Vue templates in one of my PRs about jQuery, which logged a warning since there was a script tag in the template. I had to learn about side effects in Vue from resources such as this Stack Overflow post, which explains that Vue disallows side effects not just for script tags but also style tags.

Vue test utils

Since every new feature in MarkBind requires unit testing, I had to create unit tests for the scroll top button component. Therefore, I had to learn how to use Vue Test Utils and its snapshots. Specifically, I had to learn how to (a consolidated sketch follows this list):

  • deal with setTimeout. This was probably the hardest part, as trying to mock setTimeout (following this tutorial) and using nextTick (following this other tutorial) in the Vue test did not work. I had to resort to using a setTimeout in the test to wait out the setTimeout in the component.
  • mount components with props and attached to a document
  • dispatch events to trigger the scroll event needed for the scroll top button component
  • test if a function has been called
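A consolidated, hypothetical sketch of those techniques (component name, selector and timings are illustrative):

```typescript
import { mount } from '@vue/test-utils';
import ScrollTopButton from '../components/ScrollTopButton.vue';

it('shows the button after the user scrolls down', async () => {
  const wrapper = mount(ScrollTopButton, { attachTo: document.body });

  window.dispatchEvent(new Event('scroll')); // trigger the scroll listener
  // Wait out the component's own setTimeout rather than mocking it.
  await new Promise((resolve) => setTimeout(resolve, 600));

  expect(wrapper.find('.scroll-top-button').isVisible()).toBe(true);
  wrapper.destroy();
});
```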

jQuery, Cheerio and JavaScript for DOM manipulation

As I had to write a plugin and remove jQuery, I became a lot more familiar with DOM manipulation using Cheerio, jQuery and vanilla JavaScript. Since I had to remove jQuery from MarkBind, this page was very useful for understanding how to convert from jQuery to vanilla JavaScript. I also learnt from the jQuery API documentation about each function's behavior, especially more advanced functions like wrap and on. Through this PR, I became more familiar with using vanilla JavaScript for DOM manipulation as well: for example, how to create elements, add styling, scroll, etc. Because the contact form plugin required DOM manipulation, I used Cheerio for that PR and learnt about its API. The Cheerio API documentation was very helpful for understanding the calls.
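A minimal sketch of such a conversion:

```typescript
// jQuery: $('.panel').addClass('expanded').on('click', handler);
const panels = document.querySelectorAll<HTMLElement>('.panel');
panels.forEach((panel) => {
  panel.classList.add('expanded');          // replaces $.addClass
  panel.addEventListener('click', handler); // replaces $.on('click', ...)
});

function handler(event: Event): void {
  (event.currentTarget as HTMLElement).scrollIntoView({ behavior: 'smooth' });
}
```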

CSS

As I worked on some front end bugs, I had to learn more about CSS. Specifically:

  • How styles override each other. The iconColor in boxes was not working in certain circumstances due to Bootstrap styling. Hence, I had to learn more about overriding CSS styles. This guide was useful in teaching me about it, and I also learnt about the !important property in CSS, which was what was causing the bug.
  • Transitions. The panel transition had an abrupt jump, which was a bug in the CSS transitions. I learnt that CSS transitions animate from 0 to the max-height, and if the max-height is not set correctly (in this case it did not include margins) there will be problems.
  • CSS selectors. As I had to style the form plugin and also finish up a PR on standardising tab buttons, I learnt about the CSS selectors used for styling, from selecting by tag to selecting by descendants. This guide was useful for my learning.

TypeScript and migration

As I did a TypeScript migration, I learnt to code in it. Specifically:

  • Different ways to import and export files and functions in TypeScript, which differ significantly from JavaScript. The MarkBind documentation on this was very useful for my understanding.
  • Defining and importing types. I learnt that npm packages often define interfaces for their packages, making it easier to convert.

I learnt about the similarity index of GitHub files. According to the TypeScript migration documentation, there is a need for two separate commits: one to rename the file and one to adapt it. If both were done within one commit, the similarity index would fall below the threshold and the commit history would be lost. This is something I will definitely take note of in the future when renaming files.

I had to do a squash commit for the TypeScript migration. I learnt about the differences between rebasing and merging through this article, including the pros and cons and the dos and don'ts of rebasing. I encountered a problem where squashing the merge resulted in the PR containing recent commits from the master branch. This taught me not to mix up merging and rebasing, and to just do one or the other.

GitHub Actions

I learnt how to configure GitHub Actions as I had to upgrade the Node version from 14 to 16 in MarkBind/markbind-action and remove some deprecated syntax. I learnt about the workflow of GitHub Actions and its purpose. I also had to learn how to test GitHub Actions: I followed the tutorial in the MarkBind Dev Guide on testing, and also attempted to use the VSCode extension for GitHub Actions to test more effectively, following this tutorial.

Nunjucks

I learnt about Nunjucks when making documentation updates, specifically regarding Nunjucks macros: how to declare them, how to write if statements, and how to use them. The Nunjucks documentation was particularly useful.

Node.js versioning

I learnt about Node.js versioning when upgrading the Node version and when writing the documentation on migrating Node.js. For example, odd-numbered Node.js versions are unstable and reach end of life sooner, while even-numbered versions are maintained for a longer period. I also learnt the difference between a major release and a minor release: the former contains major, breaking changes, while the latter contains smaller changes which are not breaking.

Deployment

For the Node.js version upgrade, I had to check that deployment worked with the higher version. I therefore had to learn how to deploy the sites with GitHub Pages, CircleCI, AppVeyor and Surge. I didn't do it with Travis due to a persistent account error. I followed the MarkBind tutorial to learn the deployment process for each CI platform.

Reuse principle

There was an issue with MarkBind where using an include within another include caused the outer variable to override the inner variable, which can lead to a cyclical reference error. I was told that this is in line with the golden principle of reuse, where outer variables should be allowed to override the inner ones so that components can be reused without having to change their inner contents.

Documentation

I learnt about the importance of good documentation and how to manage documentation. It is quite easy to keep adding things into documentation, but it is more important to present information in a way that is presentable and palatable to users. For example, MarkBind had an issue with cyclical references in includes, and it would be good to document this. However, since it was an edge case, it was recommended to instead use a panel that was less noticeable, so users could easily skip over it if it was not applicable to them.

RepoSense

Marcus Tang Xin Kye

DevOps

Gradle

Gradle is a build tool designed specifically to meet the requirements of building Java applications. Once it’s set up, building an application is as simple as running a single command on the command line. Gradle performs well and is also useful for managing dependencies via its advanced dependency management system.

Learned about Gradle through a really helpful tutorial.

Bash and Batch Scripting

I learned how to write basic bash scripts via tutorialspoint, and had to implement batch scripts to perform environment checks on all files tracked by git: ensuring they end with a newline, that no prohibited line endings (\r\n) are present, and that no trailing whitespace is present.

Some interesting bugs were encountered when attempting to use pipes in batch files, particularly one that prevents delayed expansion of variables from being evaluated as usual. This is due to variables not being evaluated in the batch context, as the piped lines are executed only in the cmd-line context. A more detailed analysis of the bug was done by a user on Stack Overflow.

Codecov

As I explored Codecov to determine why it would intermittently fail for GitHub actions, I developed a greater appreciation for the role of code coverage analysis in ensuring software quality. I found its integration with popular CI/CD platforms to be seamless, making it easier to track and improve code coverage across projects. The visualization tools, such as the sunburst graph and diff coverage reports, were especially helpful in identifying areas that needed more testing attention. Furthermore, learning about Codecov's ability to enforce coverage thresholds and generate pull request comments reinforced the importance of maintaining high-quality test suites.

Frontend

Vue

Vue is a progressive JavaScript framework that simplifies the creation of responsive and efficient web applications. Its reactive data-binding and component-based architecture promote modular programming; learning about this expanded my understanding of how modularity leads to more maintainable and scalable code.

Pug

Pug is a templating engine that integrates well with Vue, allowing for cleaner and more concise HTML code with the use of whitespace and indentation for structure. By removing the need for closing tags, Pug attempts to make code more readable and organized. Its support for variables, mixins, and inheritance facilitates code reusability and modular design, improving the overall structure and readability of templates.

Cypress

Cypress is an end-to-end testing framework that simplifies the process of writing and executing tests for web applications. I found its syntax and API intuitive and user-friendly, making tests more enjoyable to write, and its support for network stubbing encourages thorough testing. I was particularly impressed with the real-time reloading feature, which allows for faster debugging and development, simplifying E2E testing.
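A hedged sketch of a Cypress test (hypothetical selectors, not RepoSense's actual suite):

```typescript
describe('summary view', () => {
  it('lists commits for an author', () => {
    cy.visit('http://localhost:9000'); // served report under test
    cy.get('.summary-chart').should('be.visible');
    cy.contains('.author-name', 'alice').click();
    cy.get('.commit-message').should('contain.text', 'Add unit tests');
  });
});
```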

Backend

Bloch’s Builder Pattern

Bloch’s Builder pattern is a design pattern that simplifies object instantiation in Java, particularly for classes with numerous constructor parameters, while maintaining immutability and improving readability. It was particularly useful when refactoring the CliArguments.java class, as it had a large number of constructor parameters and required flexible construction since some of the fields were optional. The pattern facilitates immutability and reduces the risk of introducing errors in complex Java classes. Read more about it on Oracle's blog.
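A hedged sketch of the pattern, shown here in TypeScript for brevity (RepoSense's actual CliArguments is Java and has many more fields):

```typescript
interface CliArguments {
  readonly configFolderPath: string;
  readonly outputFilePath?: string;
  readonly sinceDate?: Date;
}

class CliArgumentsBuilder {
  private args: { configFolderPath: string; outputFilePath?: string; sinceDate?: Date };

  constructor(configFolderPath: string) {
    this.args = { configFolderPath }; // required parameter up front
  }

  // Optional parameters are set fluently, in any combination.
  outputFilePath(path: string): this { this.args.outputFilePath = path; return this; }
  sinceDate(date: Date): this { this.args.sinceDate = date; return this; }

  build(): CliArguments {
    return Object.freeze({ ...this.args }); // immutable result
  }
}

const cliArgs = new CliArgumentsBuilder('./configs')
  .sinceDate(new Date('2023-01-01'))
  .build();
```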

Polymorphism

Polymorphism is a core object-oriented programming concept in Java that allows objects to adopt multiple forms and behaviors based on their context. It promotes code cleanliness and extensibility, and reduces coupling between components, resulting in more flexible and modular applications that can evolve and scale easily. By leveraging polymorphism, I was able to reduce the amount of logic in the main method of RepoSense.java, utilizing RunConfigurationDecider to return the appropriate RunConfiguration based on the CliArguments, with the repo configurations then obtained from getRepoConfigurations().
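A hedged sketch of the idea (illustrative names in TypeScript; the actual classes are Java):

```typescript
// The main flow depends only on the interface; the concrete configuration
// chosen elsewhere decides the behavior, so main() stops branching on types.
interface RunConfiguration {
  getRepoConfigurations(): string[];
}

class ConfigFileRunConfiguration implements RunConfiguration {
  getRepoConfigurations(): string[] { return ['repo-from-csv-config']; }
}

class CliRunConfiguration implements RunConfiguration {
  getRepoConfigurations(): string[] { return ['repo-from-cli-flags']; }
}

function run(config: RunConfiguration): void {
  config.getRepoConfigurations().forEach((repo) => console.log(repo));
}
```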

Discrete Event Simulator (DES)

A discrete event simulator (DES) models a real-world system as a set of logically separate processes that autonomously progress through time. This design was well suited to building a CLI wizard, as it allows us to maintain a deque of prompts to be shown to the user, while also allowing new prompts to be added to the deque depending on the user's responses.

Misc

Git Commands/Functionalities

In RepoSense, a variety of git commands are utilized to get information about repositories. Through undertaking DevOps tasks, I was also exposed to other interesting git commands. Here are some interesting ones that I was not aware of before.

git shortlog - Summarizes git log output, where each commit will be grouped by author and title. This is used in RepoSense to easily count the commits by the users.

git grep - A powerful tool that looks for specified patterns in the tracked files in the work tree, blobs registered in the index file, or blobs in given tree objects. Patterns are lists of one or more search expressions separated by newline characters; an empty string as the search expression matches all lines. Utilized in RepoSense scripts to perform environment checks on all files tracked by git: ensuring they end with a newline, that no prohibited line endings (\r\n) are present, and that no trailing whitespace is present. Used the git docs to learn how to use git grep properly and what its various flags do.

.mailmap - If the file .mailmap exists at the top level of the repository, it can be used to map author and committer names and email addresses to canonical real names and email addresses. This is useful for mapping multiple aliases of the same author or committer, and provides a way to share the mapping with all other users of the repository. Used the git docs to learn how to configure a mailmap properly.

URL Shortening

Researched interesting solutions for free URL shortening, looking into 3 main ways to do it. Read an in-depth writeup in the GitHub issue here.

Charisma Kausar

1. Tools and Technologies

The RepoSense frontend is built with Vue.js and Pug, with most of the JavaScript files being migrated to TypeScript over the semester. Node.js is used to manage the packages, while static code analysis is performed with ESLint. Cypress is the tool of choice for testing the frontend.

1.1 Vue.js

Prior to working on RepoSense, I had experience working with Vue.js using Vuetify components and the Options API. However, working on the project allowed me to delve deeper into the intricacies of Vue and how to fully utilize its features.

1.1.1 MVVM Architecture Pattern

Vue.js focuses on the 'ViewModel' layer of the MVVM (Model-View-ViewModel) architectural pattern, connecting the Views and Models via 2-way data binding. In this case, the view is the DOM (Document Object Model), and the models are plain JavaScript objects.

1.1.2 Leveraging Template Refs for Custom Behaviors

While Vue has a rendering model that abstracts away direct manipulation of the DOM, it is sometimes necessary to access the DOM to programmatically control an element. Hence, Vue gives us access to $refs, which are similar to document.querySelector('.element') in JavaScript but more efficient, as they give direct access to the needed element rather than searching for the first element that matches the given selector. This allowed me to implement custom behaviour, such as pinning the file title, within Vue.
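A hedged sketch of a template ref in a Vue 2 options component (illustrative names):

```typescript
import Vue from 'vue';

// Template: <h3 ref="fileTitle">{{ file.path }}</h3>
export default Vue.extend({
  methods: {
    pinTitle(): void {
      // Direct handle to the element; no document-wide query needed.
      const el = this.$refs.fileTitle as HTMLElement;
      el.classList.add('sticky-title');
    },
  },
});
```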

1.1.3 Reusability with Custom Directives

Reuse of code is an essential concept in software engineering, which is why Vue offers custom directives. Custom directives allow the reuse of logic that involves low-level DOM access. They are basically objects containing lifecycle hooks similar to those of a component.

One of the custom directives RepoSense was already utilizing came from a plugin called vue-observe-visibility. This makes use of the IntersectionObserver Web API to observe whether an element is in view and execute a function accordingly.

During my work on the pin file title PR, I encountered a bug where tooltips appeared outside the viewport when at the top of the page. As the file title would be pinned to the top of the page, this issue needed to be resolved before my PR could be merged. To address it, I thought of using a custom directive, and I utilized the vue-observe-visibility directive to modify the CSS of the tooltip to be bottom-aligned based on visibility changes. While this solution worked, we required more customization, as the tooltip needed to move back to being top-aligned when scrolling up. I eventually used template refs to address the issue, but the experience helped me understand custom directives better.

1.2 Vuex

Vuex is a state management pattern and library for Vue that serves as the centralized source for all components. It enforces rules to ensure that the state can only be mutated in a predictable manner.

1.2.1 Single Source of Truth

During my work on a PR to differentiate between authors when using the 'merge group' option in RepoSense, I faced an issue with unsynchronised copies of data. Initially, I had stored the colors assigned to authors in both a local data() variable and the Vuex store. To resolve this, I used mapState as a Vue computed property to access the Vuex state from Vue components. This way, the computed property is re-evaluated every time the data changes, which triggers DOM updates and maintains a single source of truth. However, relying on the global store singleton could be considered an anti-pattern, as it can make the code difficult to test.
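A minimal sketch of reading Vuex state through mapState (illustrative state name):

```typescript
import Vue from 'vue';
import { mapState } from 'vuex';

export default Vue.extend({
  computed: {
    // Re-evaluated whenever store.state.authorColors changes, so the
    // component renders from the single source of truth, not a local copy.
    ...mapState(['authorColors']),
  },
});
```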

1.3 JavaScript

1.3.1 Dot vs Bracket Notation for Accessing Object Properties:

The dot notation (objectName.propertyName) is commonly used to access properties in a clean manner. However, it limits property identifiers to alphanumeric characters, _, and $. On the other hand, the bracket notation (objectName['propertyName']) can use all UTF-8 characters in property names, or even variables that resolve to a string. This notation is useful when the property name is only known at runtime, as in this PR where this.$refs[file.path] is used because file.path is only resolved based on the file being interacted with.
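A minimal sketch of the runtime-key case:

```typescript
const refs: Record<string, HTMLElement> = {};
const filePath = 'src/main/java/RepoSense.java'; // only known at runtime

refs[filePath] = document.createElement('div'); // bracket: key is resolved
// refs.filePath would instead look up the literal property "filePath".
```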

1.3.2 ES6 String Interpolation for Cleaner Code

ES6 introduced template strings as a concise and readable way to insert values into strings. In contrast, string concatenation can be harder to read and edit, and requires creating multiple strings that are then put together. Moreover, concatenation takes up more memory and computation compared to creating just one string.

1.4 TypeScript

TypeScript is an object-oriented programming language that supports classes, interfaces, and inheritance in the frontend. It provides static typing and type inference, making it easier to catch errors before runtime. We therefore decided to migrate our codebase from JavaScript to TypeScript to align our frontend with our OOP Java backend.

1.4.1 Class vs Interface for Typing

When working on my first PR, which defined Vue prop types explicitly, I initially used classes in TypeScript. However, after gaining more knowledge of TypeScript, I realized that interfaces are more suitable for compile-time type checking. Interfaces have less overhead since they do not exist at runtime and are erased when the code is transpiled to JavaScript. Although classes can define methods relevant to class objects, this feature was not useful for us. In a later PR, we switched to using an interface to improve the performance of the frontend.

1.5 Pug

Pug is a templating language that makes it easier to write reusable HTML components with cleaner syntax. It is useful when working with data-driven web applications like RepoSense. Although it can be challenging to find resources that provide documentation on using Vue and Pug together, Pug's syntax is much faster to develop in than HTML once you get used to it.

1.6 Sass and CSS

Sass is a CSS pre-processor and an extension of CSS. It helps reduce repetition in CSS and saves programming time by providing features like variables, mixins, imports, and inheritance. A Sass pre-processor transpiles Sass code into standard CSS as browsers can only understand plain CSS code.

1.6.1 Choosing between Placeholders and Mixins

The difference between mixins and placeholders is that placeholders consolidate mutually shared code, whereas mixins assign the properties to the individual classes along with whatever is specific to each class. Because of this, placeholders are generally preferred; but since placeholders cannot take parameters, mixins are the better choice in cases that need them.

I had to decide between placeholders and mixins when consolidating the code required for a tooltip tail, which is assigned along with some specific properties depending on whether the tooltip is top-aligned or bottom-aligned. I made use of placeholders for this, as they group together mutually shared code. In another PR, I used mixins to standardize the fonts used throughout the frontend, as fonts only need to be assigned to CSS classes along with their other properties.

1.7 Cypress

Cypress is a powerful web testing framework designed for end-to-end testing. Unlike Selenium, it operates within the application, allowing high flexibility to access any objects in the app, including DOM objects and the window, similar to how we do within the code itself.

1.7.1 Effective and efficient test case design

To ensure effective and efficient test case design, I targeted potential fault points with each of my Cypress test cases. However, I noticed repetitive Cypress commands across these test cases, which could be extracted into a common function for better reusability. Since the rest of the codebase also repeats such setup commands in all test cases, we plan to extract the setup commands into a common function to allow for reusability.

1.8 Linting

Linting is the process of performing static analysis on code to identify programming or code style errors. While I have used the code analysis tools of IDEs, I had not explicitly enforced custom coding rules using linters before.

1.8.1 Enforcing Custom Coding Rules with ESLint

During the migration to TypeScript, we decided to use the Airbnb style guide, similar to how we used it for JavaScript. Besides that, we defined other custom rules, and I created a first-timer issue dealing with the consistent use of T[] or Array<T> throughout the codebase. This helps enforce coding standards and makes the code more consistent and maintainable.


The Backend for RepoSense is written in Java, and testing is done using JUnit. Since RepoSense is for contribution analysis, Git commands are highly used within the project. Gradle is used to manage the project dependencies and for DevOps tasks.

1.9 Git

1.9.1 Understanding git log

For the PR to include merge commits in the web dashboard, some backend changes were required, as merge commits were not included in the generated report itself. Hence, I had to look into the docs of git commands, specifically git log, to understand what flags I could use to include all the desired commits in the report. Previously, we were using the --no-merges flag to remove all merges from the report. However, simply removing this flag did not include all the merge commits in the new report. This may be because git continues to simplify “uninteresting” merges in the default mode. Finally, the use of --full-history helped include all commits without merging any same-content commits together. git log also has the option to format its output with a <format-string>, and this formatted output makes it easy for us to parse the results and generate our repository analysis reports.

1.9.2 Spoofing for Good

I was surprised by how easy it is to commit as someone else using Git as long as one has write access. I had to make use of this technique when I had to create a test commit, as only commits from a selected group of users are part of the Cypress test dashboard. I spoofed one of these users so that the commit to test appears on the test dashboard.


2. Software Engineering

2.1 Design choices

2.1.1 Object parameter vs multiple parameters for constructors

While creating a User object in TypeScript, I encountered the challenge of passing a large number of arguments (~10) to construct the object. This made me wonder about the best way to initialise objects with a large number of attributes. I explored the use of a single object parameter, as it makes the code much cleaner. However, there is a tradeoff: it is not type safe to pass an untyped object as a parameter into the function. I decided to continue with the object-argument approach, as the type-safety issue could be mitigated later by checking that the object passed in implements the User interface once the code was migrated to TypeScript, which was eventually done.

2.2 Reflections

2.2.1 Understanding a Language/Tool Before Working with It

Previously, I had the mindset of just making things work without understanding the inner workings of a language or tool. However, I realized that this approach only led to superficial knowledge, making each challenge as difficult as the last. This semester, I gained a new perspective on how understanding the language/tool can make things easier down the road. I now strive for a good balance of theory and practical knowledge to accumulate my understanding and improve over time.

2.2.2 Applying the "Make it Work, Make it Right, Make it Fast" Principle

While working on a PR to differentiate between authors while using 'merge group', I applied the principle of "Make it work, Make it right, Make it fast." Initially, I focused on making it work and fixing any edge cases. Later on, I refactored the code to optimize it. Additionally, I conducted performance analysis for the PR after it was complete, which can be accessed here.

2.2.3 Full-Stack Development Experience

Working on the show merge commits PR provided a chance to work on all aspects of the codebase as a frontend developer. I researched Git to find out how to include all merge commits, edited the Java backend parsers to include an additional field for whether a commit is a merge commit, and made frontend changes to include merge commits within the HTML report. Furthermore, I wrote test cases for frontend Cypress, backend unit tests, and system tests. This experience was rewarding as it allowed me to do full-stack development and learn how all the components work together while solving a single problem.


3. Project Management

3.1 Lessons Learned from Contributing to an Open-Source Project

3.1.1 Understanding the Contribution Workflow

Contributing to RepoSense has provided me with valuable insights into the contribution workflow for open-source projects. It has helped me understand the quality expectations that are necessary for maintaining a high-quality codebase. However, having strict rules can sometimes hinder the PR review process, leading to longer review cycles. Hence, to strike a balance between quality and speed, setting guidelines and maintaining effective communication channels is essential.

3.1.2 Importance of Documentation

Documentation is an integral part of open-source projects, and its importance cannot be overemphasized. It's easy to forget to update the documentation after making changes in a PR, leading to outdated documentation. Going forward, I recognize the need to maintain up-to-date documentation, to ensure that future contributors have access to accurate and comprehensive information. To this end, I suggest having a checklist in the PR issue template to remind contributors of the need to update documentation.

3.1.3 Optimal PR Length

I received feedback from my mentors that my PRs were too long, leading to difficulty in reviewing. It was suggested that breaking down the PRs into smaller ones would make the review process easier. Based on this feedback, I have made a conscious effort to create smaller PRs going forward.

3.1.4 Understanding Versioning

Contributing to RepoSense has provided me with insights into versioning and how it is maintained for open-source projects. The process of maintaining separate web pages for documentation of released versions and the master version has been an important lesson. To deepen my understanding of project management, I am planning on making a release myself in the near future.

Chang Si Kai

Regular Expression

Java provides regular expression through the java.util.regex package, which consists of three classes: Pattern, Matcher and PatternSyntaxException.

  • Pattern is a compiled representation of a regular expression. It must be created via static methods, most commonly Pattern.compile(String regex).
  • Matcher then interprets the compiled pattern and matches against an input String.

I would like to touch on the more interesting aspects of Java's implementation of regex that I encountered along the way.

  • I found it confusing initially that in Java, in order to specify predefined character classes such as \s for whitespace characters, we have to first escape the backslash within the String representation of the regex argument (so "\\s" instead of just "\s"). While this is consistent with the way Java handles escape characters in Strings, it caused me some minor confusion and readability issues, as it is unlike other major programming languages such as JavaScript and Python.
  • Greedy quantifiers: X?, X*, X+ and more. Special care must be taken while using them due to their greedy nature. In one instance, I was attempting to rewrite a regex to match using stricter rules. I was under the impression that my regex was working fine as it matched positive test cases correctly; however, upon further investigation, it only matched because it disregarded the remaining regex sequence due to its greedy nature.
    • I found https://regex101.com/ to be excellent as a sanity check in this department. It breaks down our specified regex input into different capturing groups, and highlights the matches and groups accordingly.
  • Looking through the JavaDoc, I also found two other related quantifiers:
    • Reluctant quantifiers: X??, X*?, X+? and more, where the extra ? at the end demarcates it as a reluctant quantifier.
    • Possessive quantifiers: X?+, X*+, X++ and more, where the extra + at the end demarcates it as a possessive quantifier.
  • The difference between greedy and reluctant is that:
    • For greedy, the matcher "eats" the entire input before attempting to match. If there is no match, the matcher backs off the input string by one character and tries again until a match is found or no more characters are left.
    • For reluctant, the matcher starts off at the beginning of the input string, "eating" one character at a time to look for a match. It stops the moment a match is found or there are no more characters left to "eat".
  • Possessive quantifiers start with the entire input and never "back off", even if doing so would allow the overall match to succeed. The sketch below demonstrates the three behaviours.
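Here is a minimal, self-contained demo of the three behaviours, using the classic input string from the Java tutorial's quantifiers lesson:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuantifierDemo {
    public static void main(String[] args) {
        String input = "xfooxxxxxxfoo";
        show(".*foo", input);  // greedy: one match, the entire string
        show(".*?foo", input); // reluctant: matches "xfoo", then "xxxxxxfoo"
        show(".*+foo", input); // possessive: .*+ eats everything and never backs off, so no match
    }

    private static void show(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        System.out.print(regex + " ->");
        while (matcher.find()) {
            System.out.print(" [" + matcher.group() + "]");
        }
        System.out.println();
    }
}
```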

Git Clone Bare

Cloning with the --bare flag copies only the contents of the .git folder (the repository's history and metadata, without a working tree) and makes it the top level of the cloned directory.

Git Shallow Clone

This allows us to pull down only the latest commits, rather than the entire repo history, by specifying a depth. The benefit of a shallow clone is faster cloning, since fewer objects are downloaded. In RepoSense's case, we utilize the --shallow-since flag, as it fits our use case better than the --depth flag.

Synchronization

Communication between threads happens primarily through sharing access to fields and the objects that reference fields refer to. However, this introduces two new kinds of errors: thread interference and memory consistency errors.

Thread interference happens when two operations, running in different threads and acting on the same data, interleave.

Memory consistency errors occur when different threads have inconsistent views of what should be the same data.

  • Avoiding them requires the happens-before relationship, which is simply a guarantee that memory writes by one specific statement are visible to another statement.

Java provides synchronization as a tool to prevent these new forms of errors. It is an action that creates a happens-before relationship.

  • Synchronized methods:

    • It is not possible for two invocations of synchronized methods on the same object to interleave; while one thread executes a synchronized method on an object, all other threads invoking synchronized methods on that object block until the first thread is done.
    • When a synchronized method exits, it automatically establishes a happens-before relationship with any subsequent invocation of a synchronized method for the same object. This ensures that the changes to the state of the object are visible to all threads.
    • Constructors cannot be synchronized as it does not make sense, since only the thread that creates an object should have access to it while it is being constructed.
    • final fields cannot be modified after the object is constructed, so it can be safely read through non-synchronized methods.
  • Intrinsic/monitor lock

    • Enforces exclusive access to an object's state and establishing happens-before relationship which are essential to visibility.
    • Every object has an intrinsic lock associated with it. A thread that needs exclusive and consistent access to an object's fields has to acquire (and said to own) the object's intrinsic lock before accessing them, and release the lock when it is done with them. No other threads can acquire the same lock during this time.
    • For static synchronized methods, the thread acquires the intrinsic lock for the Class object associated with the class.
  • Synchronized blocks

    • Allows us to synchronize only some instructions within a method. Requires a monitor object to be passed to the synchronized block, most commonly this. The Class object is used in place of this for static synchronized blocks.
  • Reentrant synchronization

    • Threads cannot acquire a lock owned by another thread. However, a thread can acquire (again) a lock it already owns.
    • Happens in a situation where synchronized code directly or indirectly invokes a method that also contains synchronized code, and both sets of code use the same lock.
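To make this concrete, below is a minimal counter sketch showing a synchronized method and an equivalent synchronized block; without the synchronization, the final count would often fall below 20000 due to thread interference:

```java
public class SynchronizedCounter {
    private int value = 0;

    // Synchronized method: invocations on the same object cannot interleave,
    // so the read-modify-write of value++ is safe.
    public synchronized void increment() {
        value++;
    }

    // Synchronized block: same effect, but the intrinsic lock (on this)
    // is held only for the critical section.
    public void incrementWithBlock() {
        synchronized (this) {
            value++;
        }
    }

    public synchronized int get() {
        return value;
    }

    public static void main(String[] args) throws InterruptedException {
        SynchronizedCounter counter = new SynchronizedCounter();
        Runnable task = () -> {
            for (int i = 0; i < 10_000; i++) {
                counter.increment();
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        System.out.println(counter.get()); // always 20000 with synchronization
    }
}
```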

GitHub Actions

GitHub Actions workflows (defined in .github/workflows) are triggered when an event occurs in the repository. For example, when an issue or a pull request is created, we can define the actions to be taken through the workflow .yml files.

  • Jobs

    • A job is a set of steps in a workflow executed on the same runner. Each step is either a shell script or an action. Since the steps are executed on the same runner and can depend on each other, we can configure what should happen if a step fails, etc.
  • Runner

    • A runner is a server that runs a workflow when it is triggered. GitHub provides Ubuntu Linux, Microsoft Windows and macOS runners to run the workflows. Each workflow run executes in a fresh virtual machine.

  • General .yml file

    • I will highlight some essential aspects of a typical .yml file.
    • on: This allows us to specify what event will trigger the workflow.
    • This is followed by a list of jobs. Within each job, we can specify:
      • runs-on: what type of runner the job should be run on,
      • steps: a list of steps, within which we can specify:
        • run: We can use this to run scripts and shell commands. To run scripts, we will have to store the script in our repository and supply the path.
  • Sharing data between steps

    • By using GitHub environment variables, we can share data and information between steps. RepoSense uses the GitHub output environment variable to extract the PR number, which a later step uses to download deployment status artifacts and deploy development previews. We can access it via steps.<step_id>.outputs.<variable name>.
  • Sharing data between jobs

    • It is possible that we might want to share data and information across different jobs, or even different workflows. To set up deployment previews on Surge.sh, RepoSense requires some information (e.g., the PR number, the generated MarkBind documentation) that is obtained in the main integration.yml workflow. To do so, we can upload artifacts to GitHub using upload-artifact, found in the GitHub marketplace, then download them in another workflow.

David Gareth Ong

Vue & Options API

Although I was already familiar with Vue, I only ever used the newer composition API, and thus had to learn the Options API that is used in the frontend of RepoSense.

I got familiar with the API as I worked through the implementation of this PR, which involved a decent amount of refactoring across multiple Vue files. The main resource that I used was the official Vue docs, as it provides a comprehensive yet easy to understand overview of the different concepts. Additionally, it has a toggle to switch between the Composition API and Options API for each page of the documentation, allowing people who are already familiar with one to easily pick up the other.

Here are some of the main things that I learnt:

Importance of computed properties

In RepoSense, there are many properties that we need to calculate/obtain when other properties are changed. For instance, in the zoom panel, we need to maintain a list of commits to be displayed. This list needs to be re-calculated based on other properties, such as the author that is currently selected, the filters applied to the commits (e.g. only show commits in .js files), etc. In Vue, such properties should be implemented as a computed property under the computed object in the export.

The main advantage of computed properties is that they are cached, and only re-computed when one of their reactive dependencies changes. In the above example, this means our list of displayed commits is only re-computed when the currently selected author changes, or a filter is added/removed. This significantly improves performance: if we were to implement the computation in a normal method, it would be re-computed on every re-render, even if the re-render was triggered by an unrelated reactive item, resulting in unnecessary re-computation of the same value. In a frontend application like RepoSense's reports, where there are many such properties, utilising Vue's computed properties provides a much needed performance boost.

Avoiding direct manipulation of DOM

One of the main advantages of using a framework like Vue is that certain aspects of modifying the DOM are abstracted away from the user. Vue handles reactivity for the user, updating the DOM when reactive state is mutated. Hence, problems can arise when users bypass this functionality and manually modify the DOM within Vue components. Vue has no knowledge of these modifications, so they can clash with Vue's own mutations of the DOM.

This PR involved deprecating the use of a method that manually modified the DOM in order to toggle the show/hide state of commits. That method of toggling involved a manual mutation of a CSS class, while a synchronous method calculated and updated the number of shown/hidden commits based on this CSS class, storing the count in a reactive variable. However, since Vue's updates to the DOM are asynchronous, the variable was always one action behind the 'true' state, which caused an incorrect display of the 'show/hide all commit messages' text. The problem was fixed by working 'within' Vue: modifying a reactive variable on toggle change and letting Vue handle the DOM mutation. Hence, we should always try to solve the problem within the framework, and avoid direct mutations of the DOM as much as possible.

Deep vs Shallow copy when passing data

When passing data between components, care should always be taken with regard to how the data is passed, and the consequences of any mutations of that data. If mutations to data only make sense within the context of a particular component, then it is preferable to pass a deep copy of the data to prevent said mutations from changing behaviour outside of its scope.

Cypress testing

RepoSense utilizes Cypress for E2E testing, where the tests run in an actual browser that accesses the entire web page by URL, as opposed to only a particular view/component. The Cypress docs are a great resource for learning how to write tests, and were the main resource that I used when learning.

Test Isolation

One of the main things that confused me at first was why Cypress was configured to 'start from scratch' for each test case, i.e. it starts from the beginning of the RepoSense report/from a reload of the entire app for every single test case. After reading the corresponding page of the docs, I learnt that this was important to ensure the consistency & usefulness of each individual test case. By resetting the DOM state before each test, it ensures that each test functions independently, which in turn ensures that the running of any test does not impact the outcome of other tests. Otherwise, there might be a scenario where test case A passes, but causes a change that results in test case B failing. In this case, the results of the tests might be misleading, as the failure was a result of actions not confined within the test case itself.

Along a similar line, testing of functionality should be isolated whenever possible. One of the test cases that I wrote was to test that the toggle state of a file persisted after sort. My original idea was to toggle the state of the first file, then change the sort order from 'descending' to 'ascending' and checking the toggle state of the last file. However, this implementation relies on the correctness of the sort functionality, and hence an error in the sorting function might result in this test case failing, which would be misleading. Therefore, in the actual implementation, the file is tracked by file path and searched for after the sort, which isolates this test case from the correctness of the sort functionality.

TypeScript

Importance of enums

TypeScript supports enums similar to other languages like Java. Having enums is very convenient when you want to restrict a variable to a certain set of values, instead of an entire type. For instance, in RepoSense, there are many instances where variables only take specific values, such as a sort type only having valid values equal to "groupByNone", "groupByRepos" and "groupByAuthors".

In this instance, typing the variable as a string would technically be correct, and it would detect errors where the variable is set to a value of another type. However, it does not detect errors where the variable is an invalid string, such as GroupByNone or groupByNon. This can easily occur due to developer error, as the string is manually referenced throughout the codebase (e.g. filterType = 'groupByNone', if (filterType === 'groupByNone')). If a typo was made in one of these references, the code would still compile without warnings, but the bug would exist in production. Using enums helps us twofold: first, we can replace all manual references with the enum (e.g. filterType = FilterGroupSelection.GroupByNone, if (filterType === FilterGroupSelection.GroupByNone)), which prevents individual typos (a typo such as FilterGroupSelection.GroupByNon will fail to compile as it does not exist); and second, if we want to change the string itself, we just have to change it in the enum definition instead of everywhere in the codebase.

Type Predicates

Often, we have objects of similar types that work closely together (for instance, both stored in the same array), and we might have to distinguish between them in certain circumstances. For instance, in RepoSense, when we initially read the commits from the report generated by the backend, we store these commits as a certain type, but in the frontend we process these raw commits to add certain attributes to form a different type that inherits from the former. Hence, in the codebase, it is important to distinguish between these two types.

In order to narrow the type (e.g. if we have an object that can be either one of the two types but we want to be certain which one it is), we can use type predicates, which narrow down the object type based on the shape of that object. For instance, in RepoSense, I used a type predicate to differentiate between the Commit and DailyCommit types. The Commit type extends DailyCommit, adding an optional field deletions. The type predicate checks whether the field deletions exists on the object, and uses that to determine the type of the provided object. This is used in code where we have a bunch of objects that are minimally DailyCommits, but we are not sure if they are Commits. The type predicate allows us to distinguish this, and therefore safely access the objects that have the deletions field as Commits.

Type inference vs Explicit types

One difference between TypeScript and traditionally strongly typed languages is the type inference feature of the TypeScript compiler. In TypeScript, in certain cases such as function return types, the compiler can infer the type of the return object from the code itself without explicit declaration from the programmer. For instance, consider the following code:

function double(x: number) {
  return x*2;
}

TypeScript automatically infers the return type to be number in this case. Of course, we can define the return type explicitly:

function double(x: number): number {
  return x*2;
}

This is functionally equivalent, with a small difference being that explicit declaration saves the compiler the work of inference. At first glance, then, explicit declaration seems redundant, since the compiler can infer the type. However, relying solely on inference can be prone to bugs, because the compiler assumes that your function definition is correct. Consider this example:

function double(x: number) {
  return (x*2).toString();
}

Although the above is an obvious bug (assuming we want double to return a number), the TypeScript compiler doesn't know the user's intent, and so happily infers the return type as string. Essentially, the compiler will never throw an error on inference, because it assumes the user's implementation is correct. However, if we know that we want the double function to return a number and use an explicit return type declaration, the same code will result in an error:

function double(x: number): number {
  return (x*2).toString();
}

Hence, there is value in explicit type declarations: they essentially tell the TypeScript compiler "I expect the function to return this", which lets TypeScript check whether our function does indeed return the expected type, improving type safety.

TEAMMATES

Sim Sing Yee, Eunice

Frontend

Angular

I have had previous experience working with Angular, so I knew of some basic concepts.
However there are still new things I learned while working on the onboarding task and review PRs:
    1. **Angular Pipes**:
        - It is preferred to use pipes over manually transforming the data
            - To illustrate:
                ``` html
                <!-- Function -->
                <p>{{ "Hello World".toUpperCase() }}</p>

                <!-- Pipe -->
                <p>{{ "Hello World" | uppercase }}</p>
                ```
            - better performance: pure pipes only execute their `transform` method when the underlying value changes.
            - promote code reuse: pipes can be declared once and used throughout the application.

        - There are 2 different types of pipes
            - pure pipe: only runs when the underlying value changes.
            - impure pipe: runs in almost every change-detection cycle.
                - To create an impure pipe, set `pure: false` in the pipe's declaration.
                ```javascript
                @Pipe({
                    name: 'myPipe',
                    pure : false
                })
                ```

            If the input to the pipe is an object with nested fields, a pure pipe will not detect changes to those nested fields, since the object reference itself does not change.
            This can be fixed by creating an impure pipe, but the performance of an impure pipe is significantly worse as `transform` executes far more frequently, which is especially costly when the input to the pipe is large (eg. a list).

    Resources:
    - [Pipes: improved performance](https://zmushegh.medium.com/why-use-pipe-instead-of-function-in-angular-507cf972bfb0)
    - [Pure vs Impure Pipes](https://upmostly.com/angular/understanding-pure-vs-impure-pipes-in-angular)

Bootstrap

I had only used pure CSS before this module so this is the first time I worked with a CSS Framework like Bootstrap.
When reviewing PRs on TEAMMATES, I learned about the various global styles Bootstrap provides.
For example:
- Grid system, `col` & `row`: built upon CSS's flexbox architecture; Bootstrap provides functionality to control how the column / row changes.
    - example: `class="col-6 col-lg-4"`: 50% wide on viewports below 992px (ie. below the `lg` grid breakpoint) and 1/3 of the width on viewports 992px and above
    - 6 default grid tiers: `sm`, `md`, `lg`, ...
    - if no unit/number provided, bootstrap will distribute the space equally (eg.`class="col"`)

It is preferred to use Bootstrap classes where possible rather than creating our own responsive CSS.

Resources:
- [Bootstrap Docs: Grid System](https://getbootstrap.com/docs/5.3/layout/grid/)

Snapshot testing

I have never used snapshot testing before and only heard of the concept in passing.
From this module, I not only learned about how snapshot testing is done on TEAMMATES (from TEAMMATES' dev docs), but also did a bit of research into how snapshot tests are used to ensure no unexpected changes in the UI.

Backend:

PostgreSQL (database)

For local development on TEAMMATES, Docker runs an instance of the PostgreSQL database in a container.

ORM (Object Relational Mapping)

Before working on the data migration project, I had never worked with an ORM, nor even known of the concept.

Over the course of the project, I learned what an ORM is, and how it makes backend development easier by mapping between code written in an OOP language and data in a relational database, reducing the time spent handling data manually.


Additionally, I gained first-hand experience working with the Hibernate ORM.

Hibernate (Java ORM Framework)

In particular, v9 migration uses Hibernate (an ORM framework for Java environments) for TEAMMATES.

Over the course of the project, I learned some of Hibernate's concepts:
- states in a Hibernate session (Transient state: not yet attached to a session, Persistent state: associated with a session)
- **Persistence Context** & cache-memory / caching in Hibernate (including how the first-level cache, second-level cache and managed entities work)
    - Persistence Context: it is a staging area that sits between the code in TEAMMATES and the PostgreSQL database; the concept is implemented in Hibernate `session`.
    - Hibernate's `session` manages all the data loaded into it, keeps track of any changes to the data and is responsible for updating the data in the database later.
    - Managed Entity: a record in the database that's been loaded into a Hibernate session, and is managed by that session (ie. track any change to the entity and updates database accordingly).

- flushing the session (synchronises the objects in memory / cache with the database)
    - Avoid manually forcing the session to flush
    - Hibernate makes no guarantees about exactly when the session will be flushed

- **Annotations** (Since Hibernate implements the JPA specification, it supports the Jakarta Persistence Annotations in any environment, on top of providing its own Hibernate Annotations.)
    - Jakarta Persistence Annotations: `@Entity` (specifies a POJO as a JPA entity), `@Id` (specifies a field as the Primary Key), `@Column` (specifies details of a table's column, eg. `nullable`), `@Table` (to change table/relation name)
        ```java
        @Entity
        @Table(name = "DeadlineExtensions")
        class DeadlineExtension {
            @Id
            private UUID id;
        }
        ```
    - Jakarta Persistence Mapping Annotations (`@ManyToOne`, `@OneToMany`, `@JoinColumn`) to create relationships between entities.
        ```java
        // Bi-directional many-to-one relationship.
        // 1 feedback session to many deadline extensions.
        @Entity
        @Table(name = "DeadlineExtensions")
        class DeadlineExtension {
            @ManyToOne
            @JoinColumn(name = "sessionId", nullable = false)
            private FeedbackSession feedbackSession;
        }

        @Entity
        @Table(name = "FeedbackSessions")
        class FeedbackSession {
            @OneToMany(mappedBy = "feedbackSession", cascade = CascadeType.REMOVE)
            @Fetch(FetchMode.JOIN)
            private List<DeadlineExtension> deadlineExtensions;
        }
        ```
    - Hibernate Annotations (`@UpdateTimestamp`)

Resources:
- [Starting Guide for learning Hibernate](https://www.digitalocean.com/community/tutorials/hibernate-tutorial-for-beginners)
- [Hibernate: Entity lifecycle](https://www.baeldung.com/hibernate-entity-lifecycle)
- [Object States in Hibernate](https://www.baeldung.com/hibernate-session-object-states)
- [Hibernate: Caching](https://docs.jboss.org/hibernate/orm/6.2/userguide/html_single/Hibernate_User_Guide.html#caching)
- [Hibernate: Flushing the Session](https://docs.jboss.org/hibernate/core/3.3/reference/en/html/objectstate.html#objectstate-flushing)

Criteria API

Criteria API is used to construct type-safe queries that fetch entities from the database.
For the v9 migration, I learned and gained experience with using these queries to fetch entity / entities from the PostgreSQL database.
Such complex queries (built using `CriteriaBuilder`, `CriteriaQuery` and `TypedQuery` classes) include:
- using clauses (`SELECT`, `FROM`, `WHERE`),
- joining multiple relations (`JOIN`) and
- conditional conjunctions (`and`, `equal`, `greaterThan`, ...) provided by the API.

Example of a complex query:
```java
    // parameters: feedbackSessionId, userId

    CriteriaBuilder cb = HibernateUtil.getCriteriaBuilder();
    // specifies that the query should return DeadlineExtension object(s)
    CriteriaQuery<DeadlineExtension> cr = cb.createQuery(DeadlineExtension.class);
    Root<DeadlineExtension> root = cr.from(DeadlineExtension.class);
    Join<DeadlineExtension, FeedbackSession> deFsJoin = root.join("feedbackSession");
    // Joins deadline extension table with User table by deadlineExtension's user field.
    Join<DeadlineExtension, User> deUserJoin = root.join("user");

    // equivalent to SQL's where clause:
    // SELECT ... WHERE de.feedbackSessionId = feedbackSessionId AND de.userId = userId
    cr.select(root).where(cb.and(
            cb.equal(deFsJoin.get("feedbackSessionId"), feedbackSessionId),
            cb.equal(deUserJoin.get("userId"), userId)));

    TypedQuery<DeadlineExtension> query = HibernateUtil.createQuery(cr);

    // Streams in the results of query, and find the first or return null if none.
    return query.getResultStream().findFirst().orElse(null);
```

Docker

Docker provides services that allow us to run processes locally in isolated containers. I wanted to do more research into Docker and understand more about how it worked when I was working on the onboarding task. This includes learning about commands like `docker-compose`, and how hosting and virtualisation are done on Docker.

Resources:
- [Containerization Explained by IBM Technology](https://www.youtube.com/watch?v=0qotVMX-J5s)
- [Virtualization Explained by IBM Technology](https://www.youtube.com/watch?v=FZR0rG3HKIk&t=64s)

Testing

I had previously learned about the concept of testing and the various types of tests but never had a chance to work extensively with it.

Over the course of the module when working with TEAMMATES, I realised the importance of testing and gained much valuable experience writing test cases (unit and integration tests in particular).

Additionally, I learned to use the following testing frameworks (a combined sketch follows this list):
- **TestNG** (Automation Testing framework)
    I mainly used TestNG's annotations to aid in writing unit and integration tests on TEAMMATES.
    For example:
    - `@Test`: to specify that the method is a test case.
    - `@BeforeMethod` and `@AfterMethod`: to specify that the annotated method must be executed before / after each `@Test` method / test case in the class.
    - `@BeforeClass`: to specify that the annotated method must be called once, before running all test cases in that class.
- **Mockito** (Mocking Framework)
    Mockito was introduced in TEAMMATES to make mocking and stubbing in unit tests easier.
    Mocking in Mockito:
    - When Mockito creates a mock, the object is completely "fake"; it never executes real invocations of the mocked class.

    I used Mockito to mock classes that the class being tested depends on, stub method calls (with `when`, `thenReturn`, `thenThrow`, etc.) and verify interactions with the mocked class (with the `verify` method).
        ```java
        MyClass expectedObject = new MyClass("id");

        MyClass myMockedClass = mock(MyClass.class);
        // define the behaviour of the mocked class: instead of executing the real
        // method, the mocked method simply returns the expected object
        when(myMockedClass.getObject("id")).thenReturn(expectedObject);

        // the stubbed method must actually be invoked before it can be verified
        MyClass actualObject = myMockedClass.getObject("id");

        // verify the number of calls to the stubbed method on the mocked class
        verify(myMockedClass, times(1)).getObject("id");
        ```

    Additionally, I researched other concepts Mockito provides: `spy` (for partial mocking) and mocking of static methods (`mockStatic`), but never had a chance to work with them.
    - Spy:
        - a spy does partial mocking; ie. in comparison to mocks, which never execute the real methods ("only a fake object exists"), a spy wraps a real object and un-stubbed calls use real method invocations ("a real object exists and we are spying on it").

    - Mocking of static methods:
        - older versions of Mockito cannot mock static methods on their own (a third-party tool like PowerMock was needed), but newer versions support it natively.
        - instead of `mock`, use `mockStatic` to create a mock of a class's static methods.

    - `verify`
        - I initially ran into some issues with `verify` and did more research into the cause;
            In the Action-layer tests, since the mocked logic object is shared across all action test classes, the mocked object accumulates the count of calls to any of the mocked logic methods across multiple test cases.
            Work-arounds:
                1. create the mock afresh whenever a new test case is run (like how it's done for the logic and db layer classes) or
                2. use the `clearInvocations` method to reset the invocation counts of a mock between test cases.
        - Takeaway: A mock will keep track of all its past invocations.

    - Mocking void methods:
        - Mockito's default behaviour for void methods is `doNothing` (ie. the mock does nothing and will not execute the real method)
        - If the void method has some complex behaviour: we can use `doAnswer` (to define custom behaviour) or invoke the real method (`doCallRealMethod`, though this reintroduces a dependency into the unit test)

Resources:
- [TestNG Annotations](https://www.javatpoint.com/testng-annotations)
- [Mockito: Spy vs Mock](https://stackoverflow.com/questions/28295625/mockito-spy-vs-mock)
- [Mocking Static Methods with Mockito](https://www.testim.io/blog/mocking-static-methods-mockito/)
- [Handling void methods with Mockito](https://www.baeldung.com/mockito-void-methods)

Git

Over the course of the data migration, I learned how to rebase a branch and revert to a past commit.
Additionally, I did some research into other git commands used during development like force pushing a branch.
I think my knowledge and understanding of Git has improved greatly, beyond merging, `git pull`, `git fetch` and `git push`.

Github Editor

A cool tip / trick I learned from my mentors / peers: changing ".com" in the GitHub URL to ".dev", or pressing "." when on a pull request page, will open up the web editor for the PR, making it very easy to review the code and submit PR reviews.

PR / Code reviews

While reviewing pull requests for other maintainers, I realised there is still much I can learn about Angular, Bootstrap, and even about the codebase (eg. its structure). Hence to be able to give the best review and advice, I did more research into these technologies and concepts before submitting reviews. (Some of this research is in the "Frontend" part)

Project management

I realised over the course of the module that there are many facets to managing a large open-source project, and it extends beyond reviewing PRs for contributors.
In particular, I learned that I not only have to ensure the code works, but also check whether it is consistent with the codebase, consider whether there are better ways to solve the issue, and provide guidance that helps the contributor rather than directly providing the solution.

Ong Jun Heng, Cedric

Angular

Context

Having worked on CATcher for IWM last summer, I've gained familiarity with Angular. Working on TEAMMATES, I've deepened my knowledge of the framework, and there were two new things that I've picked up about Angular.

Pipes

Although I was previously aware of the use of pipes, I was not aware of the performance aspect. In particular, pipes are much more efficient than method calls for rendering values:


<h1>{{ name.toLowerCase() }}</h1>

<h1>{{ name | lowercase }}</h1>

When using a method, it is run whenever the component detects changes. However, a pipe is only run when its input, in this case name, changes.

Here is a medium article that dives more into the benefits of using pipes.

Angular Template

ES6 template literals and nested string interpolation aren't supported in Angular:


<div>{{ `(${text})` }}</div>

This was something that I learnt from an open source contributor in this PR.

Hibernate

Context

Prior to taking CS3281, I had only used Java in CS2030S, CS2040S and of course CS2103T, never on a live system used by actual users. This was hence my first experience writing a Java backend, and I'm glad that I got the opportunity. I am confident that I'm now able to work on backend systems with Hibernate, from defining database tables and specifying entity relations to writing queries. The following are a few aspects of Hibernate I'd like to highlight.

Entities

In Hibernate, each entity class created by the developer maps to a corresponding table in the database (with some exceptions, which I'll get to later). A typical Hibernate entity looks like this:

@Entity
public class Class {
    @Id
    @GeneratedValue
    private Long id;

    @Column(nullable=false)
    private String field;
}

There's quite a lot going on here, so let's break it down:

  • On top of normal Java classes, Hibernate uses annotations (preceded with '@') to denote properties of classes and class fields.
  • The @Entity annotation identifies a class as an entity class, whose fields are to be persisted to the database.
  • @Id specifies the primary key.
  • @GeneratedValue is typically used for primary key columns, to denote that a field should be generated by the database upon object creation.
  • @Column is an optional annotation that allows one to customise the mapping between the entity attribute and the database column. In this case, nullable=false ensures that the database column field for the table Class cannot hold null values.

A database table corresponding to the class will be created, with columns id and field.

There are numerous annotations, and Thorben Janssen's guide gives an in-depth overview of the most essential ones.

Entity lifecycle

Once the entities are defined, we can perform create, update and delete operations, and these effects will be persisted to the database.

For instance, when we have a Student class with a name field and we would like to update it, we simply call the field's setter, student.setName("newName");, to update the student's name.

Hibernate will persist the changes to the database automatically, without the developer having to explicitly do so.
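As a minimal sketch of this behaviour (assuming Hibernate 6 with Jakarta Persistence; the class and method names are illustrative, not TEAMMATES code):

```java
import jakarta.persistence.Entity;
import jakarta.persistence.GeneratedValue;
import jakarta.persistence.Id;

import org.hibernate.Session;
import org.hibernate.SessionFactory;

@Entity
class Student {
    @Id
    @GeneratedValue
    private Long id;

    private String name;

    public void setName(String name) {
        this.name = name;
    }
}

public class StudentUpdater {
    // Dirty checking: changes to a managed (persistent) entity inside a
    // transaction are detected and written to the database on commit.
    public static void renameStudent(SessionFactory sessionFactory, Long studentId) {
        try (Session session = sessionFactory.openSession()) {
            session.beginTransaction();
            Student student = session.get(Student.class, studentId); // now managed
            student.setName("newName"); // no explicit update/save call needed
            session.getTransaction().commit(); // Hibernate issues the UPDATE here
        }
    }
}
```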

There's a great guide on the entity lifecycle here

Inheritance

Inheritance is a key feature of OOP languages such as Java, and is also supported by Hibernate.

There are a few different ways that inherited entities are mapped to database tables.

One of which is the SingleTable inheritance strategy.

As its name suggests, in SingleTable inheritance, all child classes are mapped to one table.

For example, the classes below,

@Entity
@Inheritance(strategy = InheritanceType.SINGLE_TABLE)
public class Class {
    @Id
    @GeneratedValue
    private Long id;
}

@Entity
public class ClassA extends Class {
    @Column
    private String description;
}

@Entity
public class ClassB extends Class {
    @Column
    private Integer quantity;
}

will be mapped onto a single table, Class, with the columns id, description and quantity.

A drawback is that we cannot enforce non-null constraints on the subclass columns: records of ClassA have the attribute description but not quantity, while records of ClassB have quantity but not description.

An advantage of SingleTable strategy compared to others is that there is no need for joins. For instance, in the Joined Table strategy, each subclass will have its own table, and when querying, it is joined with the parent class' table. This was one of the reasons why SingleTable inheritance was ultimately chosen for FeedbackQuestion and FeedbackResponse entities, as they had many subclasses.

Baeldung's guide on inheritance strategies was extremely helpful to me in understanding the differences between them, and the tradeoffs one needs to consider when choosing among them.

Testing

Context

Prior to working on TEAMMATES, the only exposure to software testing I had was in CS2103T. Working on the migration to postgresql involved writing tests, and through that I've gained a slightly better understanding of the different types of tests, and a much greater appreciation of the importance of tests. When migrating the system to postgres, having the old test cases provided us with some reassurance that the changes we made to the system would not impact the existing functionalities, which is absolutely essential for a live system.

Unit vs Integration testing

Unit testing is testing individual components in isolation. For components with dependencies, these dependencies are mocked to ensure that any errors would be due to bugs in the component itself, and not its underlying dependencies. An example of this in TEAMMATES: when unit testing the logic layer, which depends on the database layer, we mock the database layer.

Integration testing, on the other hand, tests whether the various components work when combined together. Building upon the unit testing example, when doing integration testing for the logic layer, we would not mock the database layer, but rather have the database layer actually perform its operations.

Having both of these types of tests is necessary in a big system like TEAMMATES: unit testing gives us the reassurance that the individual components work on their own, while integration testing ensures that they work together. With good unit testing, we can be fairly certain that any issues in integration testing are due to the interaction between the components, rather than the components themselves, making debugging easier. Together, the tests ensure that no breaking changes are introduced to the system, which is essential in a live system like TEAMMATES.
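To make the contrast concrete, here is a minimal sketch with hypothetical StudentsDb and StudentsLogic classes (not the actual TEAMMATES ones): the unit test mocks the database layer, while the integration-style test exercises a real implementation. In TEAMMATES, the real database layer would stand in for the in-memory substitute used here.

```java
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.when;

import java.util.HashMap;
import java.util.Map;

import org.testng.Assert;
import org.testng.annotations.Test;

// Hypothetical production classes, included only so the sketch compiles.
interface StudentsDb {
    String getStudentName(String id);
}

class StudentsLogic {
    private final StudentsDb db;
    StudentsLogic(StudentsDb db) { this.db = db; }
    String getDisplayName(String id) {
        String name = db.getStudentName(id);
        return name == null ? "Unknown" : name;
    }
}

public class StudentsLogicTest {
    @Test
    public void getDisplayName_unitTest_mocksTheDbLayer() {
        StudentsDb mockDb = mock(StudentsDb.class);
        when(mockDb.getStudentName("s1")).thenReturn("Alice");

        StudentsLogic logic = new StudentsLogic(mockDb);
        // Any failure here points at StudentsLogic itself, not the db layer.
        Assert.assertEquals(logic.getDisplayName("s1"), "Alice");
    }

    @Test
    public void getDisplayName_integrationStyle_usesARealDbImplementation() {
        // In-memory stand-in for a real database layer.
        Map<String, String> rows = new HashMap<>();
        rows.put("s1", "Alice");
        StudentsDb realDb = rows::get;

        StudentsLogic logic = new StudentsLogic(realDb);
        // This exercises the interaction between the two layers.
        Assert.assertEquals(logic.getDisplayName("s1"), "Alice");
    }
}
```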

Here is an article that summarises the differences between unit testing, integration testing and other types of testing.

OOP patterns

Builder pattern

The builder pattern is useful when a class has many fields that are optional upon instantiation. Imagine a Java class that has 3 fields:

public class Foo {
    String a;
    Integer b;
    Long c;
}

For this class, say that a, b and c are optional, and that they are not needed when creating a Foo object. To cater for every combination, we would require many constructors:

Foo(String a) {
    this.a = a;
}

Foo(String a, Integer b) {
    this.a = a;
    this.b = b;
}

Foo(String a, Integer b, Long c) {
    this.a = a;
    this.b = b;
    this.c = c;
}

And many more for the other combinations...

To solve this issue, we can make use of the builder pattern, creating a static builder class inside Foo (and giving Foo a private constructor that copies the fields from the builder):


public static class FooBuilder {
    private String a;
    private Integer b;
    private Long c;

    public FooBuilder setA(String a) {
        this.a = a;
        return this;
    }

    public FooBuilder setB(Integer b) {
        this.b = b;
        return this;
    }

    public FooBuilder setC(Long c) {
        this.c = c;
        return this;
    }

    public Foo build() {
        // assumes Foo(FooBuilder builder) copies a, b and c over
        return new Foo(this);
    }
}

We can then create Foo, with a, b and c each being optional, without having to create numerous constructors:

Foo foo = new Foo.FooBuilder().setA("string").setC(100L).build();

Factory pattern

The factory pattern should be used when an object needs to be created, but the concrete class to instantiate depends on some criteria. The creation logic is then encapsulated in a factory method.

A simple example would be creation of SomeClass below:

public abstract class SomeClass {
}

public class SomeClassA extends SomeClass {
    Integer a;

    SomeClassA(Integer a) {
        this.a = a;
    }
}

public class SomeClassB extends SomeClass {
    boolean b;

    SomeClassB(boolean b) {
        this.b = b;
    }
}

Say that SomeClass is required to be created, and whether we create SomeClassA or SomeClassB depends on an enum:

enum Type {
    A,
    B
}

Type type = Type.A;
SomeClass someClass;

switch (type) {
case A:
    someClass = new SomeClassA(1);
    break;
case B:
    someClass = new SomeClassB(true);
    break;
default:
    someClass = null;
    break;
}

It would be much better to encapsulate this logic in SomeClass:


public abstract class SomeClass {

    public static SomeClass createSomeClass(Type type) {
        switch (type) {
        case A:
            return new SomeClassA(1);
        case B:
            return new SomeClassB(true);
        default:
            return null;
        }
    }
}

SomeClass someClass = SomeClass.createSomeClass(type);

This way, the creation logic can be reused throughout the application, and any changes, such as adding a new subclass, can be made in one place, allowing for more extensible code. Here is a great article on why the factory method is useful.

Misc

Migrations

Observing how the migration from the datastore to postgresql is carried out for a live system used by users worldwide, with no impact to them, is pretty amazing to me. Our dualDB approach, where we still query the datastore for courses that are not yet migrated, ensures that functionality is still available to users even while we make huge changes to the system. This is also known as a trickle migration.

Here is an article I came across when searching up on the types of migration strategies used.

Github web editor

Credits to Samuel for this, but I was previously unaware that GitHub had a web editor. Pressing . on a PR opens the web editor, which is extremely useful for reviewing PRs, especially those that change large files, as we can easily view the changes made with the context of the entire file and its various dependencies.

Git interactive rebase

I've learnt the use of interactive rebase, and how I can use it to rewrite my commit history. This is particularly useful for removing commits that are unnecessary (for instance, commits like fix checkstyle) and creating a more meaningful commit chain for my PRs. Although the commits are squashed in the end when merged, I find it important, especially for larger PRs, to keep a meaningful commit history to make things easier on reviewers.

Here is an article by atlassian that provides more details about the rebase command.

Github CLI

Prior to this module, I had never used the GitHub CLI, sticking with just git commands. However, I found the CLI very useful, especially when reviewing PRs, as it allowed me to check out someone's branch with just one command (which GitHub provides on the review page), rather than having to add their remote repo, fetch their branches and then check out the branch.

Dominic Lim Kai Jun

Angular

Context

Before TEAMMATES, I have only ever used React. To help me get started on Angular, I looked up videos on YouTube, specifically Fireship's Angular playlist, to get an overview. I tried doing a Udemy course too but I thought it was a little far-fetched.

With a background in React, I went onto look for the similarities and differences between these two popular frontend web frameworks which led me to decide to dive into TEAMMATES' codebase.

Passing data between Parent and Child components

Similar to passing of props in React, Angular has its way to pass data between parent and child components.

In Angular, we use Output for sending data to the parent and Input for sending data to the child. It took me a while to get used to the in/output terms.

What helped me through this was the Angular docs on this exact matter; it was a perfect read! It starts off with the introduction of Input and Output, and I was surprised to see it described as a pattern. The page is really well written, as it goes straight to the subject and takes a step-by-step approach with a sufficient number of examples.

Services

Working on the onboarding task (Per Receiver Submission project), I have learned how Angular, a frontend framework, communicates with the backend that uses Java EE.

For the frontend to send an HTTP request to the TEAMMATES server, we have to make use of a library/dependency to handle this action, similar to what React apps use, such as the Fetch API or Axios.

TEAMMATES makes use of Injectable, part of Angular's core package, to create services; in this case, a service for an entity class. An injectable service consists of functions that work with HTTP requests, e.g., a method to get entities that calls another service, the custom-written HTTP service.

Angular recommends using services for tasks that do not involve the view or application logic. These services are mainly used to communicate with the backend server. Here is a guide on Introduction to services and dependency injection.

While working with HTTP requests, we need to handle the operations that are involved with each request sent. TEAMMATES backend uses a RESTful API architectural style. These RESTful endpoints mainly involve asynchronous operations.

To work with such operations, we use Observables, part of the RxJS library, to ensure we resolve or reject them properly.

With Observables, we are able to not only handle the basic outcomes of calling these RESTful APIs, but also chain each response that is returned and use it to perform further operations.

Testing

I have never written tests of this extent. Aside from CS4218 which I am currently reading, the work done in TEAMMATES has really helped me to improve the way I write tests, and fully understand the importance of tests.

Spies

I have used spy before. Ironically, I never knew how to use it properly till I had to write tests on the work done.

I was struggling to figure out how to pass a check in a method of an object; let's call the object A. Object A has a method, a(), that has a condition in it: if b(...).

Method b() also belongs to object A. I could not make this condition true when writing the test. However, spy did the trick!

All I had to do was write this powerful line of code:

A spyA = spy(A.class);
doReturn(true).when(spyA).b(...);

And it worked! Sounds pretty trivial and silly I know... But Today I Learned (TIL)!

Spying on an object allows us to dig into its methods and intentionally set the outcome we expect a variable, object or method call to produce; we are in control and we define the result.

Here is a good read on spies. Love baeldung!

Integration Tests

My mentor said something that will have me remember for life - "We should not use mocks for Integration Tests.". This was a 'Today I Learned' moment.

Hibernate

Context

I have only ever used Java in my school work, mainly small OOP projects. If I recall correctly, the extent was up to using JDBC to connect to MySQL and building an MVC architecture; that was pretty much it.

But wow! Java has its own framework for building a backend. I have also heard of Spring Boot (I believe it's another backend framework!) but never used or looked into it before.

At least for me, learning Hibernate has been really eye opening and refreshing! It has not only expanded my technical skillset in the realm of Java but it drills me on my fundamentals of Object Oriented Programming.

The evidence is that Hibernate expects developers to write out the entity classes in an OOP fashion, and it does all the setup behind the scenes, e.g., setting up the schemas in the chosen database.

Annotations

These entity classes contain annotations provided by Hibernate through which we specify what we would like to see in our relational schemas: non-nullable columns, connecting schemas via associations and foreign keys, working with natural keys, etc. Pretty cool and interesting stuff!

Database Functionalities of Hibernate's API

Enough of the OOP stuff! Let's dive into the database functionalities.

Hibernate provides in-built database functionality out of the box in its API. These functions map closely to how we write SQL queries.

Let's take a look at an example.

Here, we have 2 entities - Instructor and Account. An Account is linked to Instructors via a One to Many relationship, i.e., one Account to many Instructors, with the referencing attribute stored in the Instructors schema. Hibernate does the work of linking these entities via the association we specified.

We would like to find all Instructors of a specified accountId and courseId.

In Hibernate, this is what we will write:

// cr is the CriteriaQuery, cb the CriteriaBuilder, and instructorTable the query root
cr.select(instructorTable).where(cb.and(
        cb.equal(instructorTable.get("courseId"), courseId),
        cb.equal(instructorTable.get("accountId"), accountId)));

In native SQL, this is roughly what we would have written (PostgreSQL format):

SELECT * FROM instructors
    WHERE instructors.courseId = courseId
      AND instructors.accountId = accountId;

Hence, Hibernate provides plug-and-play functionalities that closely relate to SQL operations.

Data Persistence through JPA

Another thing to highlight is the data persistence that Hibernate promises.

Hibernate is a standard implementation of the Java Persistence API (JPA) specification.

A clear example of persistence:

Say we have a Person whose name we would like to update. Since we have written the Person class in an OOP fashion as mentioned above, we can simply update the name via the setter of the name attribute, person.setName("NewName");, and that's it!

You might ask, "How about telling your database about this person's name change?".

Hibernate does this behind the scenes for you! This is all thanks to JPA.

Also, with the help of unit tests between the Logic and Db layers, I was able to verify that this worked.

Cascade Types (Delete)

In SQL systems, when we delete a parent entity that references a child entity or a list of child entities, the child entities can be deleted along with it (a cascading delete).

To do this in Hibernate, we can make use of a few different types of annotation, each has its own specific use.

Specifying cascade = CascadeType.REMOVE when declaring the association between 2 entities, parent and child

@OneToMany(mappedBy = "parent", cascade = CascadeType.REMOVE)
private List<Child> children = new ArrayList<>();

This snippet says many child entities belong to one and only one parent. When we delete this parent entity, we want to also cascade the deletion/removal onto all of the child entities linked to this parent.

This is the go-to for a simple cascade removal operation.

Specifying orphanRemoval = true when declaring the association between 2 entities, parent and child

Firstly, this is very similar to what we have above.

@OneToOne(mappedBy = "parent", orphanRemoval = true)
private Child child;

The difference is that orphanRemoval acts on the child's link to the parent entity. In other words, it applies when we have a single child entity that references a parent entity and we happen to use the setter method of this child attribute.

I.e., Doing this.

parent.setChild(null);

Hibernate will automatically detach the child entity from the parent entity, and when a child entity is left without a parent, orphaned in this case, Hibernate will remove this child entity from the database.

Using @OnDelete(action = OnDeleteAction.CASCADE)

I found this out when I was figuring out how to cascade a delete to an entity that does not have any association/relationship with the current entity, except a 'reference' like this:

public class SomeClass {
    @OnDelete(action = OnDeleteAction.CASCADE)
    private ClassWeDelete classWeDelete;
}

This means that there is no purpose in keeping an entity of SomeClass when it only makes sense for it to exist alongside a classWeDelete entity. It really does sound as if it overlaps with the above two scenarios.

But again, the key here is that there is no form of association/relation between these two different entities.

Here are some resources that helped me on understanding this:

Inheritance Strategies

As mentioned above, Hibernate as a framework leans heavily on OOP. To start off, in order to model the relationships between entities, we need to define the Java classes in an OOP fashion. Hibernate, as an Object Relational Mapping tool, does the heavy lifting for us; we just need to tell it what we want in the OOP 'language' that Hibernate interprets really well.

I picked this up when I was tasked to design the parent-child relationship for the User entity (in this PR #12071). User is the parent entity that has child entities, i.e., these child entities extend the parent User class.

When considering which strategy to use, there were other factors to consider - performance, scalability, maintainability.

Below are my findings for the 4 available inheritance strategies in Hibernate:

MappedSuperclass

  • Not scalable. If we use this, ancestors cannot contain associations with other entities.
  • Each DB table contains both base-class and subclass properties.

Single Table

  • Performant for polymorphic queries, as only one table needs to be accessed when querying parent entities; the best performance among the strategies.
  • However, we can no longer use NOT NULL constraints on subclass entity properties: since all subclasses share one table, columns that belong to only one subclass must be nullable in rows of the other subclasses.
  • For our case of Student/Instructor IS-A User, we would only have one table in the DB, i.e., User, with all the fields combined. This rules Single Table out for us: a Student is a User, but it shouldn't carry Instructor attributes/data.
  • A discriminator column can be used to differentiate between rows belonging to different subclass types.

Joined Table aka Table-per-subclass mapping strategy

  • TLDR: Inherited state is retrieved by joining with the table of the superclass.
  • Great because the PK of User appears in its child tables, e.g., in Student/Instructor.
  • Performance issues, as retrieving entities requires joins between the tables, which is expensive for a large number of records; the number of joins is higher when querying the parent class, since it joins with every single related child table.
  • Discriminator column not required, but each subclass table must hold a column with the object identifier: the PKs of the subclass tables are also FKs to the superclass table's PK. If @PrimaryKeyJoinColumn is not set, the PK/FK columns are assumed to have the same names as the PK columns of the superclass's primary table.
  • When using polymorphic queries, the base class table must be joined with all subclass tables to fetch every associated subclass instance.

Table per Class

  • Performance issues, because Hibernate performs a UNION behind the scenes when we query the base class, i.e., User.
  • Each table defines all persistent state of the class, including inherited state.
  • If we want to use polymorphic associations (e.g., an association to the superclass of our hierarchy), we need union subclass mapping. This requires multiple UNION queries, so be aware of the performance implications for a large class hierarchy. Also, not all database systems support UNION ALL; PostgreSQL81Dialect does.
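
As promised above, here is a minimal sketch of the Joined Table strategy, with hypothetical fields modelled on the User/Student hierarchy discussed earlier:

import jakarta.persistence.Entity;
import jakarta.persistence.Id;
import jakarta.persistence.Inheritance;
import jakarta.persistence.InheritanceType;

// Each class maps to its own table; subclass tables share their primary key
// with the superclass table, so fetching a Student joins the two tables.
@Entity
@Inheritance(strategy = InheritanceType.JOINED)
class User {
    @Id
    private Long id;
    private String name;
}

@Entity
class Student extends User {
    // Lives in the Student table only; that table's PK is also an FK
    // to User's PK.
    private String matricNumber;
}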

Some resources I used to help me work out the different inheritance strategies


Conclusion

Hibernate provides far more than the above, but let's look forward to what else we can learn in future tasks!

Alright that's it for now, stick around folks!

Resources

Below are some wonderful resources that have helped me along the way:

  • Baeldung's take on Hibernate here
  • The official Hibernate documentation here: literally the bible, though I find some of its examples quite bare; the rest of the resources listed here have helped me tremendously!
  • The man himself, Thorben Janssen! Over here

Miscellaneous

Large Modifications to a Live System

Crediting my teammate, Kevin, for bringing up this point.

This semester's project, the v9 migration from NoSQL to SQL, could be detrimental to a live system that relies on huge amounts of data to function if not done with caution.

In order for the team to carefully perform such a huge operation, a 'migrated' check is done on each document (previously NoSQL) to decide which path an endpoint should take when called.

This allows us to work independently on migrating the relevant parts of this live system, a 'dual DB' setup as my mentors call it, without affecting the current state that hundreds of thousands of users see.

In my opinion, it is really neat!
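
A minimal sketch of the idea; every name here is a hypothetical stand-in, not one of TEAMMATES' actual classes:

class Account {
}

interface AccountsDb {
    Account getAccount(String id);
}

class DualDbAccountsLogic {
    private final AccountsDb sqlDb;        // new PostgreSQL-backed storage
    private final AccountsDb datastoreDb;  // legacy NoSQL storage

    DualDbAccountsLogic(AccountsDb sqlDb, AccountsDb datastoreDb) {
        this.sqlDb = sqlDb;
        this.datastoreDb = datastoreDb;
    }

    Account getAccount(String id) {
        // Serve the SQL copy if this entity has already been migrated;
        // otherwise fall back to the legacy Datastore copy, so live users
        // are unaffected while migration proceeds in the background.
        Account migrated = sqlDb.getAccount(id);
        return migrated != null ? migrated : datastoreDb.getAccount(id);
    }
}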

A good follow-up after submitting a PR

Through the PRs that I have submitted, I learned that we should always open a PR as a draft first and look at it from the reviewer's perspective, to ensure that we did not miss anything and to spot any section of code that can be further improved.

A helpful part of GitHub's PR UI is the Files Changed tab. It gives a full overview of the changes made, additions and deletions, to each file.

This can definitely go a long way, especially when we have a large PR.

A different view when looking at a PR

Crediting my mentor, Samuel, for introducing this.

When looking at a Pull Request on GitHub, pressing the . key on our keyboard brings us to another page: GitHub's built-in Visual Studio Code editor in the web browser. Extremely cool stuff!

GitHub CLI

Prior to reviewing PRs in TEAMMATES, I had never really had trouble pulling the branch under review to my local machine.

I first faced trouble doing so when reviewing my first TEAMMATES PR. The reason was that the branch in the PR resides in the developer's own fork.

The process was different from what I was used to, where the branch under review was a remote branch residing in the same repository rather than in a fork.

My first go-to solution to this was to add the developer's fork as a remote repository for me to track locally by using (some parts of this guide helped!):

git remote add ANY_FORK_NAME FORK_URL
git fetch ANY_FORK_NAME
git checkout BRANCH_IN_PR

This worked fine! But at one point, I could not get it working.

I spent quite a while searching for the reason, to no avail.

A suggestion on Stack Overflow proposed using the GitHub CLI instead.

This was a plug-and-play tool, which was especially welcome since setup on Windows can be rather cumbersome. All I had to do was visit the link above and install the CLI. Next, it was just a single-line command...

gh pr checkout PR_NUMBER

This was a life saver! Checking out a developer's branch from their fork to review a PR has never been simpler!

WU QIRUI

Angular

  • Components are the main building blocks of Angular applications. Each component contains an HTML file which specifies the template, a CSS file which specifies the style, a TypeScript file which specifies the behaviour of the component, and possibly a module file which specifies the modules used by the component.
  • Each Angular component has a lifecycle. The lifecycle starts when Angular instantiates the component class and renders the component view. There are multiple lifecycle hook methods that our application can use to respond to lifecycle events. For example, ngDoCheck allows developers to customise change detection: when there is a change to the component, this method detects it and performs operations specified by the developers.
  • In Angular applications, we can pass data between parent and child components using the @Input() and @Output() decorators. More specifically, in the child component, we decorate a property with @Input(), and in the parent component template, we use property binding with square brackets, [], to bind a property of the parent to that property of the child. In this way, data is passed from the parent to the child. Conversely, to send data from the child to the parent, we use the @Output() decorator with an EventEmitter; the parent component template uses normal parentheses, (), for event binding. When we trigger the EventEmitter to emit an event, the event is passed to the parent component along with the data.
  • Apart from property binding and event binding, there is also two-way binding, which uses [()]. Two-way binding listens for events and updates values simultaneously between the parent and child components.
  • Angular has directives, which are classes that add additional behaviour to elements in Angular applications. For example, some of the most commonly used directives, ngIf and ngFor (which are structural directives), allow developers to write if-else logic and for loops in templates, so that we do not have to write repeated code.

Resources: Angular documentation

RxJS

  • HTTP responses arrive at the frontend in the form of RxJS Observables. Upon receiving an HTTP response as an Observable, we can call the subscribe method to consume it. We can also call the pipe method to chain additional operators; for example, we can pass a custom finalize callback into pipe to specify an operation to run on completion.
  • HTTP responses received as Observables are asynchronous, meaning the code is not executed sequentially; the line of code right after the subscribe call may be executed first. This is also why it is desirable to pass in finalize callbacks.
  • In the scenario where we want to manage multiple Observables and synchronise them, we can use forkJoin from RxJS. forkJoin takes in an array of Observables, waits for all of them to complete, and emits their final values together, so we need not worry about synchronisation between the individual Observables.

Docker

  • Docker is a tool used to automate the deployment of applications in containers so that applications can work efficiently in different environments. TEAMMATES uses Docker to deploy sub-services, including Apache Solr, Google Datastore, and PostgreSQL. Using Docker eases the environment setup for all developers.
  • Different applications are packaged as images which can be downloaded by developers. Developers can then start a container based on an image.
  • Multiple services/applications can be started with one command. Configuration of the different services/applications is specified inside the docker-compose.yml file.
  • To access an application in a container, we go through the published port. The mapping is written as HOST:CONTAINER; for example, when a port is configured as 5432:2345, the application inside the container listens on port 2345, and Docker maps it to port 5432 on the host, so applications outside Docker access the service via port 5432.

Hibernate

  • Hibernate ORM (object-relational mapping) is an ORM tool for converting data between a relational database and the heap of an object-oriented programming language.
  • With Hibernate ORM, developers do not need to write plain SQL queries. Instead, they can call built-in Hibernate methods to perform database operations such as create, get, join, update, and delete. This also helps prevent SQL injection attacks.
  • To turn a Java class into a database entity, we can use Hibernate ORM annotations: @Entity and @Table mark the class as an entity/table in the database, while @Id, @Column, etc. mark the fields inside the class as the columns of the table.
  • Hibernate ORM can also perform join operations. Developers just need to use the Hibernate mapping annotations, such as @OneToMany, @ManyToOne, etc. When fetching entities from the database, Hibernate will also fetch the associated entities without developers explicitly doing so.
  • Session-per-request model: Hibernate's Session object is not thread-safe, so we should not share a session across multiple threads. On the other hand, creating a new session for each database operation is expensive. The session-per-request model alleviates this: all database operations performed in a single request are wrapped in one transaction, so a request can be seen as one atomic unit of work (see the sketch below).
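
A minimal sketch of the session-per-request idea using the plain Hibernate API (assuming a SessionFactory configured for current-session lookup; all names are hypothetical):

import org.hibernate.Session;
import org.hibernate.SessionFactory;
import org.hibernate.Transaction;

class RequestHandler {
    private final SessionFactory sessionFactory;

    RequestHandler(SessionFactory sessionFactory) {
        this.sessionFactory = sessionFactory;
    }

    void handleRequest() {
        // One session and one transaction wrap all database operations
        // performed while serving this request.
        Session session = sessionFactory.getCurrentSession();
        Transaction tx = session.beginTransaction();
        try {
            // ... perform all database operations for this request ...
            tx.commit();
        } catch (RuntimeException e) {
            tx.rollback();
            throw e;
        }
    }
}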

Qiu Jiasheng, Jason

Angular

TEAMMATES uses Angular as its frontend framework. Some Angular features I learned:

  • Mostly fixed folder structure (which is different from React.js):
    • HTML template
    • TypeScript class
    • CSS
    • Module: Used to organize an application and extend it with capabilities from external libraries
  • *ngFor: Repeats element given a list defined in component class
  • *ngIf: Conditional rendering
  • Two-way Binding: Listen for events and update values simultaneously between parent and child components
    • Notation: [()]
    • Works together with @Input and @Output tags
  • Pipes: Used to transform data displayed in templates

Resources:

Web Accessibility

Context

Before this project, I did not know what web accessibility meant or what even a screen reader was. This project showed me a new perspective on how we can help people with audio and visual impairments to navigate the web with ease.

How to Achieve Web Accessibility

4 Main Principles: Perceivable, Operable, Understandable, Robust

  • Interactive elements should be focusable by the keyboard (e.g. when tabbing)
  • Tabbing order should match visual layout of the page
  • Support for screen reader:
    • All inputs should have a corresponding label
    • Labels should be descriptive so the screen reader can read them aloud

Implementation Details

  • Add aria-label for custom labels
  • Use <h1> to <h6> tags for headings
    • Rationale: Screen readers can pick them up for more accessible navigation
  • <b> should be avoided
    • Rationale: Standardize using CSS for styling
  • Tables with two headers:
    • Problem: Difficult for screen reader users to understand relationship between header and data cells
    • <th>: Indicates header cells
    • Use scope attribute to designate header cell as the header for column or row
    • When navigating the table's data cells using the screen reader, the screen reader reads out the least recently visited row/col header cell
  • Bind <label> tag to label related element using for attribute

Resources:

Bootstrap

  • Prioritize using bootstrap utility classes over custom CSS
    • Rationale: Bootstrap is responsive by default. Minimize CSS customization.
    • E.g. table-responsive
    • Caveat: Bootstrap is not invincible! There were instances where Bootstrap utility classes cannot be used (e.g. min-width)
  • Breakpoints: Used to achieve mobile-responsiveness
    • Class infix: sm, md, lg, xl, xxl
    • Most Bootstrap utility classes support class infixes for mobile responsiveness
  • Grid system: Mobile-first 12-column system

Resources:

Snapshot Testing

While updating the frontend for the User-friendliness Hero project, I updated many snapshot tests. It was interesting to see how static HTML web content can be tested using snapshot testing.

  • Use auto-update mode when updating snapshot tests

End-to-end Testing

E2E testing tests the application as a whole.

Selenium

TEAMMATES uses Selenium for E2E testing.

  • Selenium: For automating web applications for testing purposes. Some features include:
    • Navigating and clicking on elements
    • Inputting data into input boxes
    • Finding elements by id and class name: Used to verify state of the UI
  • However, Selenium and ChromeDriver are unstable. Thus, the tests are not deterministic and may fail at times.

Solr

TEAMMATES uses Apache Solr to support full-text searches for instructors, students, and account requests. Here is what I learned about the search feature:

  • Solr keeps a separate storage of searchable fields in the form of documents
    • Rationale: Though more space is needed, the search can be more optimized and customized
  • This means that Solr documents need to be added/edited/deleted if the associated database entity changes

Git

  • Fetch a remote PR into local repository: git fetch upstream pull/XXX/head:prXXX (Credits to Wei Qing!)
    • Before: I would always add a remote for the fork and then pull from it, which was quite troublesome
  • Interactive Rebase: git rebase -i <base>
    • Pros:
      • Maintains clean and linear history
      • Avoids adding redundant merge commit
    • Cons: Need to be careful to not rewrite history by force pushing

Resources:

Kevin Foong Wei Tong

Frontend

Angular (Frontend framework)

I learnt Angular from zero prior knowledge as part of CS3281.

Having previous experience with React, I found it interesting to see the different approaches the frameworks take, and to use them to implement functionality such as sorting and search. For example, Angular splits the CSS, HTML, and JS into three distinct files, while React mixes JS and HTML in a single file for each component.

Frontend optimization techniques (Debouncing, throttling)

I also learnt about and implemented some frontend optimization techniques, such as debouncing and throttling, so that queries to the backend are not made for every input in the search bar. Compared to React, Angular has more functionality built in, such as debouncing on the EventEmitter, whereas in React we would have to use third-party libraries such as Lodash.

Backend

Presentation-Logic-Data Layering

I learnt more about writing well-abstracted code and how to structure a backend system by splitting it into distinct layers, each with its own responsibility. I was also able to discuss with my mentors how to improve and abstract responsibilities, improving the software architecture of TEAMMATES.

Working on TEAMMATES led me to understand and discover more code architectures, such as the PresentationDomainDataLayering pattern described by Martin Fowler.

For example, thanks to this code architecture, we are able to swap our database layer from Datastore to PostgreSQL with Hibernate without having to make extensive changes to the other layers! Working on TEAMMATES has allowed me to greatly appreciate the benefits of well-designed code architecture and its effect on the maintainability of software.

Exception wrapping

I learnt and implemented the practice of exception wrapping (related to the layering concept above) to abstract lower-level details from higher-level components.

For example, I made the DB layer responsible only for database operations. Exceptions thrown during primitive DB-layer operations are wrapped in the logic layer into relevant logic-layer exceptions, so as to abstract away the details of data access from the rest of the application.

I referred to this online guide by Jenkov on Exception Wrapping.
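
A minimal sketch of the practice, with hypothetical exception, entity, and DAO names rather than TEAMMATES' actual classes:

import jakarta.persistence.PersistenceException;

class Course {
}

interface CoursesDb {
    Course getCourse(String id); // may throw a PersistenceException
}

class EntityNotFoundException extends Exception {
    EntityNotFoundException(String message, Throwable cause) {
        super(message, cause);
    }
}

class CoursesLogic {
    private final CoursesDb coursesDb;

    CoursesLogic(CoursesDb coursesDb) {
        this.coursesDb = coursesDb;
    }

    Course getCourse(String id) throws EntityNotFoundException {
        try {
            return coursesDb.getCourse(id);
        } catch (PersistenceException e) {
            // Wrap and rethrow with the original exception as the cause,
            // preserving the stack trace while exposing only a logic-layer
            // type to callers.
            throw new EntityNotFoundException("Course not found: " + id, e);
        }
    }
}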

How to ensure product stays live even during migration

Working on a product which serves live users every month, I learnt how we can migrate from the existing GCP Datastore to PostgreSQL while still serving actual users (without downtime). This can be done by implementing logic in code to check whether an entity has been migrated yet, allowing the code to serve users throughout the migration process.

Also, verification of the changes is important, hence a staging environment needs to be used to test and verify before redirecting traffic to the newly deployed, database-migrated version of TEAMMATES.

Hibernate ORM

I've learnt how to use Hibernate ORM and understood the layers of abstraction it provides. Hibernate helps map entity classes to actual records in a database. Each entity has two main states, managed and unmanaged. If changes are made to managed entities, Hibernate flushes them at the end of the session; they do not need to be explicitly flushed.

Debugging test case failures due to Hibernate

The entity states initially added some complexity to updating entities in the database using Hibernate. I realised that the original entities are cached in the persistence context. However, during testing, even after flushing some updates and then fetching the entities, the joined entities were not found, which led to failing test cases.

Upon consultation with my mentors Hieu and Samuel, I realised that this was caused by updating relations from only one side of the entities, which is currently intentional (circular bidirectional dependencies were causing cascade errors). As a result, the persistence context was not updated even though the database was.

Hence, I learnt to use HibernateUtil.getSession().clear() to clear the managed-entity cache and thus ensure that the entities are fetched from the updated database.

By flushing and then clearing, the reads triggered after the clear happen after the writes queued by the flush, resulting in a view-serializable schedule; this fixed the test cases.
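
A minimal sketch of the flush-then-clear sequence using the plain Hibernate Session API (the HibernateUtil wrapper is elided, and the entity is hypothetical):

import jakarta.persistence.Entity;
import jakarta.persistence.Id;
import org.hibernate.Session;

@Entity
class Parent {
    @Id
    Long id;
}

class SessionRefreshExample {
    static Parent refetch(Session session, Long parentId) {
        session.flush(); // push queued INSERT/UPDATE statements to the database
        session.clear(); // evict managed entities from the persistence context
        // The next fetch now re-reads the database instead of returning a
        // stale cached instance.
        return session.get(Parent.class, parentId);
    }
}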

Hibernate flushing behavior

Also, I learnt that flushing does not commit the transaction; it only queues the SQL statements for execution. Forcing a commit after each flush would negatively affect performance, which is Hibernate's rationale for not mandating one (commits only occur after a certain number of transactions). Stackoverflow post.

Eager and Lazy loading concept

I've also learnt that in Hibernate, eager loading means that related entities are loaded from the database along with the main entity, while lazy loading means that related entities are only loaded when they are explicitly accessed. Here are some examples:

  • Eager loading:
@Entity
public class DeadlineExtension {
    @ManyToOne(fetch = FetchType.EAGER)
    private FeedbackSession feedbackSession;
    // ...
}

In this example, the FeedbackSession entity is eagerly loaded along with the DeadlineExtension entity, meaning that whenever a DeadlineExtension is fetched from the database, its FeedbackSession is also fetched.

  • Lazy loading:
@Entity
public class DeadlineExtension {
    @ManyToOne(fetch = FetchType.LAZY) // this is the default we are using in TEAMMATES
    private FeedbackSession feedbackSession;
    // ...
}

In this example, the FeedbackSession entity is lazily loaded, meaning that it is not fetched from the database until it is explicitly accessed. This can help improve performance by reducing the number of database queries made during the application's lifecycle.

Optimizing queries with SQL and abstracting SQL with Criteria API

I have also learnt how to use CriteriaAPI to create SQL queries, and even to write dynamic queries that reduce the number of queries needed to fetch a large number of records based on dynamic selection predicates.

For example, using SQL allowed me to optimize some of the previous queries. Instead of making n queries to fetch n records as before, I learnt to use CriteriaAPI, which provides an abstraction over SQL, to write a dynamic query as follows:

// cb, cr, and instructorRoot are the CriteriaBuilder, CriteriaQuery<Instructor>,
// and Root<Instructor> set up earlier (omitted here).
List<Predicate> predicates = new ArrayList<>();
for (String userEmail : userEmails) {
    predicates.add(cb.equal(instructorRoot.get("email"), userEmail));
}

// A single query with (courseId AND (email_1 OR ... OR email_n)) replaces
// the n separate queries made previously.
cr.select(instructorRoot)
        .where(cb.and(
                cb.equal(instructorRoot.get("courseId"), courseId),
                cb.or(predicates.toArray(new Predicate[0]))));

As shown, CriteriaAPI allows us to specify a list of predicates as the selection condition for a dynamically generated query, abstracting away SQL so the programmer does not need to write it directly.

General SWE skills and tools

Docker

Prior to TEAMMATES, I had used Docker but was not able to fully appreciate its utility. Working with Docker in TEAMMATES allowed me both to learn how to debug with it and to appreciate the utility of container technologies in software development.

Several of the benefits and learnings are:

  • Streamlining the developer setup experience across different platforms. As my teammates and I were using different computer architectures and OS platforms, Docker's ability to run anywhere the Docker engine is installed allowed us to share new container configurations, such as the PostgreSQL setup for the migration, via the docker-compose.yml file. Hence, I realised that Docker is a useful tool for improving the developer experience and makes program setup much simpler.

  • Debugging with Docker: In order to verify and access PostgreSQL, I learnt that debugging can be done by logging into the container. When using Docker, it is common to run services inside containers; if issues with a service are encountered, I was able to log into the container and debug the service from there. This can be done using the docker exec command (e.g. docker exec -it <container> psql -U <user>, with placeholder names), which allowed me to interact with PostgreSQL inside the running container.

Gradlew

After bringing up Issue #12020, I reviewed the solution from an open-source dev. To guide the dev towards fixing the issue (as seen in the thread), I also learnt more about how Gradle is configured and about its jobs. Hence, working on CS3281 allowed me to understand how to build and configure gradlew to run desired jobs such as lint and test.

Testing

I've learnt how to better utilise various kinds of tests, such as unit and integration tests.

Mockito and its importance for Unit testing

I've learnt how to effectively use Mockito to mock lower-layer and third-party dependencies, use spies to ensure certain methods are invoked, and so on, in order to isolate the specific unit under test.

For example, using mocks and spies is essential to isolate the software under test into small, specific units.

Using Mockito, I was able to mock dependencies and perform multiple powerful assertions, such as verifying that a method was invoked a given number of times with specific arguments:

verify(<object>, times(<number of times>)).<method>(<argument>);
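
A concrete, hypothetical instance of this pattern:

import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.times;
import static org.mockito.Mockito.verify;

import java.util.List;

class VerifyExample {
    @SuppressWarnings("unchecked")
    static void demo() {
        List<String> mockLog = mock(List.class);

        mockLog.add("migrated");

        // Assert that add("migrated") was invoked exactly once.
        verify(mockLog, times(1)).add("migrated");
    }
}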

Importance of testing

For example, by writing tests for the various FeedbackSession actions I migrated, I discovered bugs that were previously undetected and had been merged into the main branch. For instance, the various entities' toString() methods were invoking each other, causing a cyclic infinite loop which would have crashed the application if pushed to production.
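
A minimal sketch reproducing that kind of bug with hypothetical entity classes: once two instances reference each other, printing either one recurses until a StackOverflowError.

class CourseEntity {
    InstructorEntity instructor;

    @Override
    public String toString() {
        // Calls InstructorEntity.toString(), which calls back into this one.
        return "CourseEntity{instructor=" + instructor + "}";
    }
}

class InstructorEntity {
    CourseEntity course;

    @Override
    public String toString() {
        return "InstructorEntity{course=" + course + "}";
    }
}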

Through unit and integration testing, I also discovered other hard-to-detect bugs that had previously been merged, such as incorrect parameter orderings and CriteriaAPI bugs where the wrong key was referenced or where a Join needed to be explicitly declared first. This would not have been possible without good testing practice and identifying the test cases most likely to find bugs.

I have learnt to analyse the written code for places with higher cyclomatic complexity or complicated logic, which are the places where focused test cases have the highest chance of identifying bugs.

Utility of static analyzers in detecting bugs

Through my CS3281 journey, I also realized that static analyzers such as PMD not only enforce coding standards but are also very effective in finding potential bugs. For example, PMD highlighted areas where null was possible, allowing me to discover and fix several bugs.

Architecture testing with ArchUnit

Prior to CS3281, I was unaware of the existence of Architectural tests.

I've learnt that these tests can be very useful in enforcing the interactions and layering of our software architecture. For example, TEAMMATES does not allow logic classes to interact directly with the UI, or the UI to interact directly with the storage classes.

Using architectural tests can help us enforce the architecture of our application (e.g., an n-tier architecture with UI, logic, and database layers).

I managed to adapt the existing architectural tests to fit the new requirements: instead of having an attribute DTO class, we use the SQL entity class directly to pass data. Hence, I learnt more about architectural testing by writing DescribedPredicates to fit our needs.

For example, ArchUnit uses a declarative style of defining the test cases:

noClasses().that().resideInAPackage(includeSubpackages(LOGIC_PACKAGE))
        .and().resideOutsideOfPackage(includeSubpackages(LOGIC_CORE_PACKAGE))
        .should().accessClassesThat(new DescribedPredicate<>("") {
            @Override
            public boolean apply(JavaClass input) {
                return input.getPackageName().startsWith(STORAGE_PACKAGE)
                        && !input.getPackageName().startsWith(STORAGE_SQL_ENTITY_PACKAGE);
            }
        })
        .check(forClasses(LOGIC_PACKAGE, STORAGE_PACKAGE));

In the above code, I was able to specify that no classes in the logic package should directly access the previous Google Datastore storage files, though they may directly access the new SQL storage files (required since we refactored the codebase to drop the unnecessary Attribute classes and use the Hibernate entities directly).

Regression testing

Throughout my CS3281 journey, I was exposed to and wrote regression tests to ensure new changes did not break existing code.

I've learnt that regression testing is extremely important, especially when there is a collective effort in performing migration work with many changes; we need to ensure new changes don't break existing functionality.

Also, working in a larger team in CS3281 emphasized this importance, since regressions are more likely when more developers are working on the same code.

OSS project maintenance

Working with international devs

I gained new experience working with open-source devs from across the world. I helped guide them, provided technical support and feedback, and raised issues.

GitHub Actions for automation / DevOps

Also, from TEAMMATES, I have learnt how tools such as GitHub Actions can be used to maintain a large-scale project, for example using the OSS bot for regular maintenance updates and suggestions.

SWE/Project management best practices

Splitting PRs

I learnt the importance of splitting work across smaller PRs so reviewers can review code more easily. As both a reviewer and a PR author, I've grown over the weeks, as shown by my splitting PRs into smaller ones for easier reviews. Though this might seem like a small tweak, I believe my experience in CS3281 has made me a better team player and software engineer.

Communication skills

I learnt the importance of keeping the entire team updated on what each member is doing, and the importance of sprint planning and standup meetings. For example, having regular update procedures allows the team to better understand what everyone is doing and prevents potential duplicate or missing work. During CS3282, I hope to recommend and practise standup meetings with my teammates.

SWE practices

I learnt some SWE practices and proposed changes to improve code quality. For example, I discussed best practices with my mentors and refactored functions that existed solely to throw an exception versus returning a boolean. I researched and discussed different SWE ideas to determine which practice fits best.

Improving my fluency with using SWE tooling

Debugger

From CS3281, I've discovered and faced multiple bugs during my implementation journey.

I've used the debugger extensively: stepping into, out of, and over lines of code, setting breakpoints, and monitoring variables. For example, when implementing PR #12360, I discovered that certain conditionals were incorrect by monitoring the number of DeadlineExtensions in each map/list, and corrected the conditionals using the debugger.

Shortcuts

Also, I've learnt many shortcuts to improve my developer workflow.

VSC shortcuts

  • Ctrl + P for file search
  • opt + click for multi-cursor etc
  • And many more

Terminal and keyboard shortcuts

  • To delete words and navigate quickly (opt + backspace, opt + arrow keys)
  • Deleting entire sentences (alt + u)

Github's editor

It can be accessed by changing .com to .dev in the URL or by pressing the . character on the keyboard. This has been very useful for comparing files between branches and reviewing code.

Neo Wei Qing

Frontend

Angular

I previously had no experience with Angular, so working on the frontend proved to be quite an uphill task. My main resource for learning Angular was the Angular documentation, where I was introduced to components, templates, directives, and dependency injection. I also referred to several other guides, particularly for parts such as observables (e.g. EventEmitter), which I found more difficult to understand.

Through learning about and getting more hands-on experience with Angular, I came to appreciate how it transmits data between components, as well as its conditional and looped rendering of webpage elements. The lifecycle hook methods were also useful in determining when certain logic should run. I also found the separation of HTML, CSS, and TS files to be clean and easy to understand.

I also learnt how certain features of Angular can help application performance, such as pipes, which are efficient and only run when the pipe input changes. (However, ES6 template literals and nested string interpolation aren't supported in Angular templates.)

Web Accessibility

I learnt a lot about why we need web accessibility and what we can do to make our application more accessible. A lot of the resources I learnt from were those shared by our mentor, which also detailed best practices and other design considerations.

Screen reader: A screen reader is useful to those who have their vision impaired. It reads out the elements on the webpage and allows users to navigate through them, press on buttons, etc. This means that for a user to have a good experience, our application must be designed such that they know what is on the webpage and can navigate through it easily.

Tab order and headings: One way to help navigation is to ensure that tab order and headings are correct. In particular, tabbing should follow the logical order in which fields are presented on the page, and the user should not be able to tab to elements that are hidden or currently not visible. Headings should be used to help users quickly navigate between sections, in order and without skipping levels, so as to avoid confusion.

Aria attributes: There are many ARIA attributes, but the ones we mainly used are aria-label, role, and aria-hidden. aria-label labels an element so that the screen reader knows what to read when the user focuses on that element; in the same vein, it is important to attach descriptive labels to fields for the same reason. role tells the screen reader what role an element plays in the webpage, so that it knows what to tell the user. Finally, aria-hidden, when set to true, tells the screen reader to skip past the element; this can be desirable because sometimes we don't want images or small icons to be captured by the screen reader.

Mobile Friendliness

I learnt about mobile friendliness and how to design a webpage in a way that is able to fit and work well in various screen widths. One way to do so, and in fact the way we mostly use, is by using Bootstrap's breakpoints, which helps to set styles depending on the screen width of the device.

Certain elements need to be redesigned to fit smaller screen widths. The elements I learnt about, along with their different design considerations, are modals and tables; in particular, there are several guidelines that specify how each should be laid out.

When thinking about how the frontend should look, it is important to see it from a user's perspective. Certain issues might not come to mind if we think purely as developers, e.g. buttons need to be spaced further apart on mobile, as placing them close together makes them harder to press with fingers (as opposed to a mouse).

Backend

I learnt how to properly migrate data across different databases while ensuring that the system stays up and operational. It felt like managing the dual databases this way required a lot of effort, implementing double logic in each action, entity, etc., but since most of the extra code would be retained, I found this to be a good approach.

Hibernate ORM

I didn't learn about Hibernate ORM in much detail, as I was mainly involved in user-friendliness work. What I learnt from the mentors was that Hibernate helps map entity classes to database records (with annotations such as @Nullable providing useful information about the fields), and also about entity states and session flushing.

In addition, I learnt about Hibernate's methods for database operations, and how these methods can be used in place of SQL queries for reasons like safety and performance.

Docker

Docker makes starting services easier, and enables placing applications into various containers. I briefly looked into how Docker is used, in particular how the docker compose command works and how configurations can be specified inside the docker-compose.yml file.

Others

Testing

Unit testing: I learnt about unit testing, which tests individual components in isolation, when looking through the backend code for the onboarding task and later for the migration. I also learnt how dependencies should be mocked so as to properly test for bugs in the current class without interference from other classes.

Integration testing: I wrote a few integration tests for the migration, which tested if the various classes were working together as intended. This usually involved ensuring data or errors are properly transferred across the backend layers (e.g. database, actions, etc).

Snapshot testing: These were the tests I worked with the most, and they are mostly frontend-focused. I found them most useful in determining whether a change in code resulted in a corresponding change in the webpage. This meant that looking through the potential changes in snapshot tests was crucial in ensuring that those changes were expected.

E2E testing: E2E tests exercise the application from end to end, as the name suggests: opening the browser, navigating to the page, interacting with elements, verifying the state of the webpage and database, etc. The scale of these tests means they take a long time to run. I also learnt about their instability, and why they usually had to be run multiple times in order to pass. A skill I picked up was analysing the error to determine whether a test failed because of code changes or because of instability.

Importance of testing: While the importance of testing is intuitively understood, seeing it in action makes it more pronounced. This is especially the case for regression testing during the migration, where we want to ensure that adding the second database doesn't change or break Datastore behaviour (as we want both to function simultaneously).

Git

I learnt several useful commands by searching online for solutions to pain points or just from talking to the rest of the team. For instance, I learnt about fetching a PR's code directly from the PR, which helped in reviews.

OSS

I learnt how to give better code reviews, and in particular how to think about them in a way that keeps code consistent across the entire application (e.g. the "space before checkbox" issues). I also learnt the importance of presenting feedback in a clear and constructive way.