CS3281-2021/Students
  • Knowledge gained from Projects


    CATcher

    DING YUCHEN

    Linters and Formatters

    Linters are tools that provide static analysis of your code, while formatters check and enforce a consistent code style.

    There are certain overlaps in the functionality and rule-sets with both linters and formatters, but in general, it will be good to have both as development dependencies in a project.

    In the CATcher project (at the current point of writing), we use tslint to enforce certain styles. This is integrated with continuous integration (CI) in order to ensure a certain quality of pull requests.

    However, tslint does not fix the simple errors that it catches. In fact, certain rules are not flagged at all, even though they are specified in the rule-set (e.g. an indent level of 2). This is where prettier, a formatter, comes in to enforce consistency of style across the code base.

    Furthermore, since prettier is an opinionated formatter, there isn't much setup to be done.

    prettier can be integrated with tslint in a few ways. In general, most projects use a plugin to disable all formatting rules so that formatting is handled solely by prettier. prettier can also be added as a linting rule, deferring to the linter to fix formatting errors; however, this tends to produce a lot of distracting squiggly underlines, and formatting is a little slower.

    In this project, I decided to apply prettier as a pre-commit hook so that any new changes are automatically formatted. This also somewhat decouples the formatter from the linter: formatting rules are still checked by the linter, but it is the formatter that fixes them, since prettier rewrites code rather than reporting errors.
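    As a hedged sketch of how such a hook can be wired up — assuming husky and lint-staged, which may differ from CATcher's actual setup — the package.json fragment might look like:

```json
{
  "husky": {
    "hooks": {
      "pre-commit": "lint-staged"
    }
  },
  "lint-staged": {
    "*.{ts,html,scss}": ["prettier --write"]
  }
}
```

    With this arrangement, only the files staged for the commit are formatted, which keeps the hook fast.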

    It is worth noting that tslint has been deprecated and future projects should look into using eslint for their linting requirements.

    Gitignore Pattern Format

    While it is common to see the regular ignore patterns and wildcards in a gitignore file, gitignore also comes with a negation pattern that allows for exceptions.

    The exceptions are specified using the ! prefix; a pattern that genuinely starts with a literal ! can be escaped with a backslash (\!).

    The most common use case would be something like ignoring all files and directories except a specific folder.

    However, it is worth noting that it is not possible to "unignore" a file whose parent folder has been ignored. This is due to performance reasons.

    Therefore a gitignore would look something like this:

    # ignore everything except /foo/bar
    /*
    !/foo
    /foo/*
    !/foo/bar
    

    prettier, in particular, uses the gitignore format to choose which files are subject to formatting. Ignore rules reside in a .prettierignore file at the root of the project directory.

    GitHub Actions

    GitHub Actions are a way to add automation to the repository. Most repositories make use of it to implement continuous integration (CI) and continuous deployment (CD).

    One of the interesting things I noticed about GitHub Actions is that individual actions are packaged and distributed much like dependencies, available through the GitHub Marketplace.

    Running a GitHub Action simply requires the action's name and version to be specified in a workflow file under the .github/workflows directory.
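    For illustration, a minimal workflow sketch (the file name, job, and scripts below are assumptions, not CATcher's actual configuration):

```yaml
# .github/workflows/ci.yml (hypothetical)
name: ci
on: [push, pull_request]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      # each `uses:` line names a Marketplace action and a version
      - uses: actions/checkout@v2
      - uses: actions/setup-node@v2
        with:
          node-version: '14'
      - run: npm ci && npm test   # hypothetical scripts
```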

    Angular

    Angular is a web framework for building responsive websites. As someone with experience building React applications, I found Angular's take on component-based development quite different.

    Angular applications have 2 main parts:

    1. Components
    2. Templates

    Components define the web component as a Javascript object that stores and updates values. These values are then reflected on the website via templating, which is a way to embed variable values in an HTML file.

    Decorators are Angular's way of defining the relationship between various actors in an application. The @Component() decorator for example specifies the tags that match the component to the HTML element, as well as the style and template.

    Template Binding

    The templates also allow interactions on the webpage to trigger functions in the component to effect changes.

    These come in the form of attributes in the HTML element, such as (click), known as event binding.

    Dependency Injection

    Unlike React, where everything is pretty much "hand-traceable", Angular performs certain "code magic" behind the scenes, such as resolving dependencies via dependency injection. This allows us to pass various objects into the constructor of the component without caring about initialization.

    Dependencies can be generated via the Angular CLI, or declared manually by adding an @Injectable() decorator. The providedIn field tells Angular the scope at which this dependency is accessible.

    Observables

    Observables are Angular's way of passing information between parts of the application, and are also used for event handling and data output.

    The observer pattern is a software design pattern in which an object, called the subject, maintains a list of its dependents, called observers, and notifies them automatically of state changes.

    In this sense, it is similar to the case of Streams in Java. Observables are not executed until subscribed, and for each subscriber, the observable starts a new execution. Therefore, we may consider this execution to be pure. RxJS also provides a way for subscribers to receive the same values while an execution is in progress, via multicast.
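    The lazy, per-subscriber execution described above can be sketched in plain TypeScript (a toy stand-in, not the real RxJS Observable):

```typescript
// Minimal sketch of an Observable: nothing runs until subscribe is called,
// and each subscriber triggers a fresh, independent execution.
type Observer<T> = { next: (value: T) => void };

class SimpleObservable<T> {
  constructor(private producer: (observer: Observer<T>) => void) {}
  subscribe(observer: Observer<T>): void {
    this.producer(observer); // a new execution per subscribe call
  }
}

const executions: number[] = [];
const source = new SimpleObservable<number>((obs) => {
  executions.push(executions.length + 1); // count independent executions
  [1, 2, 3].forEach((n) => obs.next(n));
});

const first: number[] = [];
const second: number[] = [];
source.subscribe({ next: (n) => first.push(n) });
source.subscribe({ next: (n) => second.push(n) });
// first → [1, 2, 3], second → [1, 2, 3], executions.length → 2
```

    Multicasting, by contrast, would share the single `executions` run among all subscribers.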

    Use of observables in the CATcher codebase is prevalent. A common pattern that is used is to compose pipes that define how the data passed from an observable should be transformed, and these pipes are then attached to the subscriber, like how one would use monads.

    Jasmine Test Suite

    Jasmine is the testing framework used by CATcher.

    Jasmine itself is a very lean set of functions for describing the expected behavior of functions and matching it against their actual output.

    Test Structure

    Jasmine tests are wrapped in a describe clause (function). The wrapper serves as a way to group similar specifications (specs) together, and also allows for nested definitions.

    Each spec consists of a string and a function to be run as a test. The test comes in the form of expectations, each of which takes an actual (the output of the function under test), chained with matchers, which form a human-readable boolean relation between the actual and the expected result.

    One thing to note is that describe and it blocks can be nested as deeply as required, and the setup and teardown routines run in depth-first order (outer setups before inner ones).
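    To illustrate that depth-first setup ordering, here is a toy re-implementation of nested describe/beforeEach/it (a sketch only — real Jasmine registers specs and runs them later, but the setup ordering is the same):

```typescript
// Toy re-implementation of nested suites to show depth-first setup order.
type Suite = { setups: Array<() => void>; parent?: Suite };

const log: string[] = [];
let current: Suite = { setups: [] };

function describe(_name: string, body: () => void): void {
  const parent = current;
  current = { setups: [], parent }; // enter a nested suite
  body();
  current = parent;                 // leave it again
}

function beforeEach(fn: () => void): void {
  current.setups.push(fn);
}

function it(_name: string, spec: () => void): void {
  // Collect every beforeEach from the outermost suite inwards,
  // run them in that order, then run the spec itself.
  const chain: Array<() => void> = [];
  for (let s: Suite | undefined = current; s; s = s.parent) {
    chain.unshift(...s.setups);
  }
  chain.forEach((fn) => fn());
  spec();
}

describe('outer', () => {
  beforeEach(() => log.push('outer setup'));
  describe('inner', () => {
    beforeEach(() => log.push('inner setup'));
    it('runs a spec', () => log.push('spec'));
  });
});
// log → ['outer setup', 'inner setup', 'spec']
```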

    In CATcher, test files mirror the directory and filename of the file being tested. For example,

    src/app/phase-team-response/issues-faulty/issues-faulty.component.ts
    

    will have a matching test file in

    tests/app/phase-team-response/issues-faulty/issues-faulty.component.spec.ts
    

    KANG SU MIN

    Angular

    CATcher is an Angular web application.

    Aspects learnt

    Angular Structure

    Components, services, and their corresponding HTML and CSS files work together to form a cohesive application. While a component is a direct representation of the visible parts of an application, a service is more subtle: it runs in the background to provide services to components where needed. By declaring the service in the constructor of a component or another service, that component or service can freely access the methods the service defines. This separation of components and services increases modularity and reusability, as through dependency injection (DI), a service class can provide services to different parts of the application.

    RxJS stands for Reactive Extensions for Javascript. It supports reactive programming for Javascript, which allows changes in data to be propagated through the application instantly. Angular makes use of the RxJS library to support asynchronous programming and improve reactivity of an Angular application.

    RxJS supports Observables and Observers, allowing Observers to receive updates on changes to the Observable they subscribe to. This implementation is similar to Observables and Observers in other programming languages such as Java.

    Pipes from the RxJS library are used very frequently in CATcher to reduce clutter and improve readability of our codebase. A pipe strings operators together in a sequence so that they are applied to the given value in order.
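    Conceptually, a pipe is just left-to-right function composition. A plain-TypeScript sketch of the idea (not the real RxJS pipe, which composes Observable operators):

```typescript
// A pipe applies each operator to the previous operator's output, in order.
function pipe<A>(...ops: Array<(x: any) => any>): (input: A) => any {
  return (input: A) => ops.reduce((acc, op) => op(acc), input as any);
}

// Hypothetical example: normalise a string in two steps.
const normalise = pipe<string>(
  (s: string) => s.trim(),
  (s: string) => s.toUpperCase(),
);
// normalise('  hello ') → 'HELLO'
```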

    Example of custom operators using pipes in CATcher:

    export function assertSessionDataIntegrity() {
      return pipe(
        throwIfFalse(sessionData => sessionData !== undefined,
          () => new Error('Session Data Unavailable')),
        throwIfFalse(isSessionDataCorrectlyDefined,
          () => new Error('Session Data is Incorrectly Defined')),
        throwIfFalse(hasOpenPhases,
          () => new Error('There are no accessible phases.')));
    }
    
    Knowledge Gained:
    1. Enforcing separation of concerns in modules by splitting functions and logic into different components and services
    2. Using observables to enable asynchronous operations in our application
    3. Using pipes to efficiently transform data used in our application

    Resources Used & Summary

    1. Angular Guide : Official Angular developer guide and introduction to basic Angular topics
    2. RxJS Guide : Official RxJS guide on Observables, Observers, Operators, Subscription, etc.
    3. Angular Guide on Navigation of Component Tree : Guide on how to navigate the component tree with Dependency Injection

    Angular TestBed Utility - Component DOM Testing

    Angular's TestBed Utility allows us to set up component tests for different components used in CATcher.

    As opposed to our existing tests, which test individual functions within components and services, these component tests allow us to inspect changes in the view of the component (caused by HTML changes). Hence, they give us a way to look into UI-related problems users might face, something our existing tests have not been able to achieve.

    Steps to set up component tests:

    1. Configure the testing module through TestBed with the corresponding component's settings
    2. Use TestBed to create the component (fixture) to be tested
    3. Observe HTML changes in the fixture during testing by querying the fixture's HTML elements
    Knowledge Gained:
    1. Setting up component tests using Angular TestBed Utility
    2. Inspecting HTML changes using Angular TestBed Utility

    Resources Used & Summary

    1. Angular Guide - Basics of testing components : Official Angular developer guide for the basics of component tests
    2. Angular Guide - Component Fixture : Official Angular developer guide on ComponentFixture
    3. Introduction to Unit Testing in Angular : Useful article on how to test component fixtures

    Jasmine (Javascript Testing)

    CATcher follows the Jasmine Style for writing tests.

    Aspects learnt

    Test Suite

    The very basics of Jasmine testing involve the test suite, which is started by calling the global Jasmine function describe. The test suite may consist of several tests (specs), each declared by calling it. Coming from a Java background, one thing I learnt about Javascript testing is that it is descriptive (as the name describe suggests), which makes the tests easier to understand.

    Spy

    One distinctive aspect of Jasmine testing (and other Javascript testing frameworks such as Jest) is the Spy object. It allows users to stub functions and keep track of useful information, such as the number of calls to the function and the parameters it has been called with, which is very useful for thoroughly testing different parts of a function under test. For example, if function B is called within function A, which is under test, the user can find out detailed information about how B is called within A by creating a spy object of B. This allows testers to verify that B has indeed been called with the correct arguments the correct number of times.
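    The recording behaviour of a spy can be sketched in plain TypeScript (a toy version; the real Jasmine API is jasmine.createSpy / spyOn):

```typescript
// Toy spy: records every call so tests can assert on them afterwards.
function createSpy<T extends (...args: any[]) => any>(impl?: T) {
  const calls: any[][] = [];
  const spy = (...args: any[]) => {
    calls.push(args); // record the arguments of each call
    return impl ? impl(...args) : undefined;
  };
  return Object.assign(spy, {
    callCount: () => calls.length,
    calledWith: (...args: any[]) =>
      calls.some((c) => JSON.stringify(c) === JSON.stringify(args)),
  });
}

// Hypothetical function B being stubbed while some function A is under test:
const fetchLabels = createSpy((repo: string) => `${repo}: ok`);
const result = fetchLabels('CATcher');
// result → 'CATcher: ok'; fetchLabels.callCount() → 1;
// fetchLabels.calledWith('CATcher') → true
```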

    Knowledge Gained:
    1. Writing unit tests for different components and functions in Typescript
    2. Using Spy objects in tests to stub functions and improve quality and efficiency of tests

    Resources Used & Summary

    1. Official Jasmine documentation : This is the official Jasmine documentation for Jasmine 3.6
    2. Introduction to Jasmine 2.0 : This is a good summary / introduction of Jasmine test features

    Electron

    CATcher uses Electron for its Desktop application. Electron is a framework for building cross-platform desktop applications. It has allowed us to build a desktop app off our Angular web application codebase.

    Aspects learnt

    Electron uses the ipcMain module to communicate with renderer processes from the main process. As an EventEmitter, it is able to send out and listen for events, allowing inter-process communication within the Electron application (as the name suggests).
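    Since ipcMain is an EventEmitter, its channel-based messaging can be sketched with Node's own EventEmitter (the 'clear-storage' channel below is illustrative, not an actual CATcher channel):

```typescript
// node:events sketch of the ipcMain/ipcRenderer messaging shape
// (not actual Electron code — Electron bridges two processes).
import { EventEmitter } from 'node:events';

const ipc = new EventEmitter(); // stands in for ipcMain

const received: string[] = [];
ipc.on('clear-storage', (payload: string) => {
  received.push(payload); // the "main process" reacts to the message
});

// A "renderer process" sends a message on the named channel.
ipc.emit('clear-storage', 'session-cache');
// received → ['session-cache']
```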

    Because it is a desktop application, it is important that we account for different operating systems in our code, since the application may behave differently on each of them.

    Example of adapting application logic to Linux O.S.

      if (isLinuxOs()) {
        // app icon needs to be set manually on Linux platforms
        windowOptions['icon'] = ICON_PATH;
      }
    
    Knowledge Gained:
    1. Understanding behaviours and characteristics of different operating systems
    2. Adapting application logic to different operating systems

    Resources Used & Summary

    1. Official Electron Guide : This is the official Electron documentation

    Github Authentication

    CATcher uses OAuth 2.0 Protocol to authenticate users. Users are required to log into their Github accounts before they can start using CATcher.

    Aspects learnt

    Web Application Authorization Flow

    The basic flow is as follows:

    1. CATcher opens a separate window that navigates to GitHub for login.
    2. Once Github verifies the user's identity, the user is redirected to CATcher with a temporary authorization code.
    3. CATcher exchanges the code for an access token, which allows CATcher to access the Github API on the user's behalf.

    The above authorization process works perfectly fine when it comes to verifying a user's identity and accessing the Github API; however, there is a security flaw.

    This authentication process is vulnerable to cross-site request forgery (CSRF) attacks, which compromises users' privacy and security.

    CSRF attack as a broad term refers to an attack that tricks a victim into doing things on a website in which they are currently authenticated by causing the victim to send a malicious request to the website server.

    In the case of CATcher specifically, during the authentication process, an attacker can send their own authentication session ID to CATcher before CATcher receives the actual response for the user from Github. This tricks CATcher into thinking that the user has been authenticated, and lets the user into CATcher using the attacker's Github account. This means that whatever information the user uploads to Github through CATcher would be uploaded to the attacker's Github account instead of their own.

    What we can do to prevent this attack is to add the state parameter to our authentication process. In our first step, before navigating to Github, CATcher can generate a random string called state (high entropy needs to be ensured for security) and send it to Github together with the other details. Upon authenticating the user, Github sends the state parameter back to CATcher, which allows CATcher to check the returned state against the one it sent. If the state parameters do not match, this may indicate a potential CSRF attack, as Github will always return the correct state. CATcher will then ignore the response and wait until the correct state is received.
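    The state generation and comparison described above might look like this (a sketch using Node's crypto module; the function names are illustrative, not CATcher's actual code):

```typescript
// Sketch of generating and validating the OAuth `state` parameter.
import { randomBytes, timingSafeEqual } from 'node:crypto';

function generateState(): string {
  return randomBytes(32).toString('hex'); // high-entropy random string
}

function isStateValid(sent: string, returned: string): boolean {
  const a = Buffer.from(sent);
  const b = Buffer.from(returned);
  // Length check first: timingSafeEqual throws on unequal lengths.
  // Constant-time comparison avoids leaking information via timing.
  return a.length === b.length && timingSafeEqual(a, b);
}

const state = generateState(); // sent along with the authorization request
// isStateValid(state, state) → true
// isStateValid(state, 'forged-value') → false
```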

    Knowledge Gained:
    1. Understanding the authentication process for OAuth 2.0
    2. Improving security of an authentication process by adding state parameter

    Resources Used & Summary

    1. Github Docs : Github Documentation on authorizing OAuth applications
    2. OAuth CSRF & the 'state' parameter : A short video on how CSRF attacks can diminish the security of an application and how state parameters can prevent them
    3. Cross-site request forgery (CSRF) : An article on how CSRF attacks occur

    LOW JUN KAI, SEAN

    Github Actions

    Aspects

    1. CI / CD

    Aspect 1: Setting up of pipelines for various Operating Systems

    Summary

    The introduction of Github Actions has provided developers with a free service to build and test their applications. This is especially important for CATcher, where we need to ensure that the builds for Unix, Windows and Mac systems work correctly.

    Knowledge Gained

    1. Setting up of pipelines to build and test on Unix, Windows and Mac Systems concurrently
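    Concurrent multi-OS builds like this are typically expressed with a build matrix; a hedged sketch (not CATcher's actual workflow file):

```yaml
jobs:
  build:
    strategy:
      matrix:
        os: [ubuntu-latest, windows-latest, macos-latest]
    runs-on: ${{ matrix.os }}   # one parallel job per operating system
    steps:
      - uses: actions/checkout@v2
      - run: npm ci && npm run build   # hypothetical scripts
```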

    Aspect 2: Deployment to Staging Site via Github Actions

    Summary

    By using Github Actions, we can enable a separate staging repository to fetch the current master branch of our CATcher-org/CATcher repository and run its own deployment, separate from the main CATcher repository.

    Knowledge Gained

    1. Enabling conditional statements to block workflows from being run
    2. Using Github Actions to allow repositories to fetch from one another despite having no common upstream.
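    A sketch of how one repository's workflow can check out another with no common upstream (the trigger, guard, and deploy script here are assumptions, not the actual staging workflow):

```yaml
# Hypothetical workflow in the staging repository
name: deploy-staging
on:
  schedule:
    - cron: '0 * * * *'   # poll hourly (illustrative)
jobs:
  deploy:
    # conditional guard: block the workflow outside the staging repo
    if: github.repository == 'CATcher-org/CATcher-staging'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
        with:
          repository: CATcher-org/CATcher   # fetch an unrelated repository
          ref: master
      - run: npm ci && npm run deploy   # hypothetical deploy script
```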

    Resources

    The Resources below are to familiarize yourself with how Github Actions works and examples of jobs that you can run.

    1. Github Actions Documentation
    2. Checkout@v2

    The Following PRs can serve as a starting point to how Github Actions are used for CI / CD on open-source projects

    1. Adding Github Actions config file for builds and tests #509
    2. Adding Staging site deployment for CATcher #546

    Angular Test Bed Utility

    Aspects

    1. Component Testing

    Aspect 1: Setting up of component tests over unit tests

    Summary

    With the aid of the TestBed utility provided by Angular, we can set up component tests to verify the usability of the components used in the CATcher application. It also gives us the benefit of being able to detect the relevant HTML changes when needed.

    Knowledge Gained

    1. Learning how to use the TestBed configuration to set up the desired component for testing using custom injections.
    2. Learning how components interact with the DOM, and using state to change the rendered HTML elements.

    Resources

    The resources below are to familiarize yourself with how you can use Angular's TestBed API for Component DOM Testing

    1. Basics to Component DOM Testing

    The Following PRs can serve as a starting point to how to use the TestBed Utility in your Angular application.

    1. Assignee Component: Add Tests

    Angular RxJS Custom Operators

    Aspects

    1. Functional Programming
    2. Code Quality

    Summary

    In RxJS, we can make use of operators to make our code more readable given multiple steps in a pipeline.

    this.getData().pipe(
        map(...), 
        map(...), 
        map(...), 
        map(...)
    );
    

    With reference to the example above, we are exposing too many details of this pipeline. This not only makes the code lengthy, but also exposes implementation details to the reader of this function.

    As a result, we can consider using custom operators:

    function transformToData() {
        return pipe(
            map(rawData => parse(rawData))
        );
    }
    

    For example, we can keep the logic of mapping raw data to actual data within a pipeable operator, which is a pure function.

    Not only does this help to abstract away implementation details, it also helps with testing, as we are able to test these custom operators individually. As a result, our final pipeline becomes more readable:

    this.getData().pipe(
        this.trimSpaces(), 
        this.trimNewLines(),
        this.transformToData(),
        this.transformToNumbers()
    );
    

    Knowledge Gained

    1. Learning how to use the custom operators to make certain pipelines more readable.
    2. Learning how to use the Angular TestBed to test the various custom operators created.

    Resources

    The resources below are to familiarize yourself with how you can use custom operators and pipeline operators

    1. RxJS Operators Documentation
    2. Guide to Custom Operators

    The Following PRs can serve as a starting point to how to use and test custom operators in the form of pipeable operators.

    1. LabelService: create custom operator for label synchronisation
    2. RepoCreatorService: Create custom operator for verifying repo creation

    PRITHIVI RAJ S/O SIVA KUMAR

    Angular Workspace & Build Configurations


    Summary

    Angular supports separate workspaces and build configurations to support a variety of tasks. A default project comes with pre-defined development and production environments, but for our use case in CATcher, there is a need for a separate build configuration for a test environment on which we can run E2E and other forms of testing.

    Aspects

    • Setting up separate environments for different purposes (testing in this case).
    • Learning about how Angular does file replacements and injects different data from files on compile and runtime for these different configs.
    • Developing Mock-Services to simulate backend API (Github API) responses and bypass certain actions such as Authentication to allow for an isolated test environment that is injected to the application on runtime.

    Sources

    Protractor (E2E Testing for Angular)


    Summary

    Protractor is the go-to E2E testing framework for Angular applications. It is essentially a wrapper around Selenium (a browser automation tool that drives browsers through their individual browser drivers). With Protractor, in the context of CATcher, we are now able to simulate the application experience and interface on a variety of web browsers to better catch potential issues that may arise when users utilize our production application.

    Aspects

    • Learned about the Protractor automation process and tech stack, specifically how it is built on top of Selenium, which then uses tools like Jasmine and browser drivers to spin up a local browser to simulate E2E testing.
    • Configuring Angular to run a specific test configuration which provides Stubs for services that communicate with the backend such that E2E testing can be conducted in an isolated environment.
    • Understanding the usage structure of Protractor, specifically how it directly interacts with HTML Elements based on a variety of Selectors through the browser to perform user actions during E2E testing.
    • Experience modifying Web-Elements such that they are compatible with multiple browser drivers to provide a stable testing framework

    Sources

    Github Actions


    Summary

    My contribution to the CI/CD pipeline was the implementation of E2E testing carried out on pull requests to guard against regressions. This is especially necessary as we move to formalize the web platform as the default and begin code-quality improvements and feature implementations. Adding E2E testing to our pipeline was also more efficient than adding it to the pre-push hooks we currently use for linting: not only do we save time on each push (E2E tests are typically long due to the need to spin up the necessary drivers and browsers), but it also ensures that tests are consistently run on standardized operating systems with the latest drivers for compatibility.

    Aspects

    • Learning of the various tools, virtual environments and configurations provided by the Github Actions Team.
    • Setting up Virtual Environments to run E2E Testing as a part of CI when PRs are made to the main repository.
    • Configuring Virtual Environments and performing manual driver configurations to add support for headless E2E tests for multiple browsers.
    • Learning to parallelize E2E testing interface to reduce time taken during our CI/CD tests prior to PR approval.

    Sources

    Probot (Github Bot)


    Summary

    Probot is a Node.js tool, built in Typescript on the Octokit framework, for creating a customizable Github bot that can manage repositories and perform complex tasks with minimal lines of code compared to plain Github Actions. Probot uses a webhook-based model: the application receives events from the Github API whenever an action takes place in the repository, and reacts to different types of hooks differently.

    Aspects

    • Configuring a custom Github App to listen to repository webhooks
    • Setting up automated labelling, PR commit message verification, etc...

    Sources

    MarkBind

    JONAH TAN JUN ZI

    Vue.js Provide / Inject

    This was introduced to me while implementing submenu support for dropdowns. This is useful as dropdowns can have multiple levels of submenus, creating deeply nested components.

    Usually, when we need to pass data from a parent to a child component, we use props. Imagine a structure with deeply nested components where you only need something from the parent component in a deeply nested child. In this case, you still need to pass the prop down the whole component chain, which can be tedious.

    For such cases, we can use the provide and inject pair. A parent component can serve as a dependency provider for all its children, regardless of how deep the component hierarchy is. This feature works in two parts: the parent component has a provide option to supply data, and the child component has an inject option to start using that data.

    Using provide and inject allows us to more safely keep developing child components, without fear that we might change/remove something that a child component is relying on. The interface between these components remains clearly defined, just as with props.

    In fact, you can think of dependency injection as sort of “long-range props”, except:

    • parent components don’t need to know which descendants use the properties it provides
    • child components don’t need to know where injected properties are coming from

    Official documentation of provide/inject

    Event publisher / subscriber

    The publisher/subscriber pattern is a design pattern that allows us to create powerful dynamic applications with modules that can communicate with each other without being directly dependent on each other. It is similar to the observer pattern: subscribers are like observers, and publishers are like subjects. However, the main difference is that in the observer pattern, an observer is notified directly by its subject, whereas in the publisher/subscriber pattern, the subscriber is notified through a channel that sits in between the publisher and subscriber and relays the messages back and forth.

    Implementation of publisher/subscriber in Javascript can be found here

    This was introduced to me when I was working on the navigation menus for the mobile site. I had to inform sibling components to close their overlays when a specific event occurred. Passing this information up to the parent and back down to each of the sibling components would be more verbose and would increase coupling. By using publisher/subscriber, each component can subscribe to a specific event; when a component needs to close the sibling overlays, it simply publishes the event, and all subscribers of the event are informed.
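    A minimal publish/subscribe broker along these lines (a plain-TypeScript sketch; the event and subscriber names are illustrative):

```typescript
// The broker relays messages; publishers and subscribers never
// reference each other directly.
type Handler = (payload?: unknown) => void;

class EventBus {
  private channels = new Map<string, Handler[]>();

  subscribe(event: string, handler: Handler): void {
    const list = this.channels.get(event) ?? [];
    list.push(handler);
    this.channels.set(event, list);
  }

  publish(event: string, payload?: unknown): void {
    (this.channels.get(event) ?? []).forEach((h) => h(payload));
  }
}

const bus = new EventBus();
const closed: string[] = [];
bus.subscribe('close-overlays', () => closed.push('siteNav'));
bus.subscribe('close-overlays', () => closed.push('pageNav'));
bus.publish('close-overlays'); // both sibling overlays are notified
// closed → ['siteNav', 'pageNav']
```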

    Resize Observer Web API

    This was discovered while working on the navigation menus for mobile. Since the navigation menus only show up on smaller screens and hide on larger screens, this means that the lower navbar can be toggled to show/hide respectively. This will cause the height of the header to change. Since there are a few CSS rules that use the height of the header, I had to dynamically adjust these CSS rules when the height changes. This also meant that I had to find a way to observe the height of the header and perform some methods when the height changes.

    The ResizeObserver web API reports changes to the dimensions of an Element's content or border box, or the bounding box of an SVGElement. ResizeObserver avoids the infinite callback loops and cyclic dependencies that are often created when resizing via a callback function. It does this by only processing elements deeper in the DOM in subsequent frames.

    Implementation details of ResizeObserver can be found here

    MarkBind's Retriever.vue for cloning reactive content

    I was introduced to this specific vue component while working on the site and page navigation menus for devices with smaller screens. As the menus for the smaller screens are different vue components from the original site and page navigations, I had to pull the content from the original menus to the newly created components. I was initially searching through the DOM to find the menus, and subsequently copying these nodes, using appendChild, into the new menus. However, this was not the best solution as reactive content will be lost if it was duplicated in this manner. To ensure consistency of the content of the navigation menus on both desktop and mobile version, content can be cloned using the Retriever.vue component to ensure the reactivity of the content.

    Refer to Overlay.vue and NestedPanel.vue to see examples of how Retriever.vue can be implemented.

    Internal implementation of Retriever.vue can be found here

    HTMLElement: transitionend event

    This was brought to my attention while I was working on redesigning the navigation bar. As the navigation bar takes up a fair amount of vertical space, on smaller devices this leaves less space for the main content. A solution was to hide the navigation bar when the user scrolls down, giving more space to the main content, and unhide it when the user scrolls up so that it can be easily accessed. This was achieved by toggling the overflow and max-height of the navigation bar. To maintain a consistent transition effect, I had to make sure that the overflow is only toggled once the transition has completed. I was initially using setTimeout, matched to the transition duration, to toggle the overflow. This is generally not a good approach, as the timing of events is not guaranteed, especially across browsers or devices. To ensure that the overflow is only toggled when the transition has ended, we can use the transitionend event.

    The transitionend event is fired in both directions: when the element finishes transitioning to the transitioned state, and when it fully reverts to the default or non-transitioned state. If there is no transition delay or duration (both are 0s or neither is declared), there is no transition, and none of the transition events are fired.

    Additionally, there are also similar transition events, transitionrun, transitionstart, transitioncancel, to track the different transition states.

    Implementation details of the transitionend event can be found here

    Bootstrap: Display property

    This was introduced to me while I was working on a PR related to the tab component. Prior to this PR, when a MarkBind page was printed/saved as PDF, only the active tab was displayed. The purpose of this PR was to show all the tabs when a page is being printed/saved as PDF. This requires setting certain elements of the tab to be displayed (the inactive tabs etc) and some to hidden (the original headers etc). I decided to use the @media print CSS rule to set those elements to display: block and display: none respectively.

    However, MarkBind uses Bootstrap which comes with a built-in display property feature which allows users to quickly and responsively toggle the display value of components. This includes support for some of the more common values, as well as some extras for controlling display when printing. This can be done by using the display utility classes.

    The classes are named using the format:

    • .d-{value} for xs
    • .d-{breakpoint}-{value} for sm, md, lg, and xl.

    For the PR's use-case, elements can be assigned the classes .d-print-{value} to change the display value of elements when printing. For instance, .d-print-none can be used to hide elements when printing and .d-print-block can be used to show elements when printing. Classes can be combined for various effects as you need.

    More examples:

    <div class="d-print-none">Screen Only (Hide on print only)</div>
    <div class="d-none d-print-block">Print Only (Hide on screen only)</div>
    <div class="d-none d-lg-block d-print-block">Hide up to large on screen, but always show on print</div>
    

    Full implementation details of display property can be found here

    Lodash: _.differenceWith method

    This was discovered when I was working on a PR related to the live preview of the pages attribute in site.json. I needed to check whether the pages attribute had been modified, and if so, which pages were modified (including pages that were added or removed). Since the pages attribute is read in as an array of objects, I needed a way to check each individual object to see if one or more of its properties had changed. I was initially aware of lodash's _.difference method, which returns an array of values not included in the other given arrays. However, this is not sufficient to detect changes to specific properties of an object. After looking around, I found another lodash method, _.differenceWith, which is similar to _.difference except that it accepts a comparator which is invoked to compare elements of one array to another. With the comparator, this method can look for changes to specific properties of objects.

    For instance, the code block below shows how _.differenceWith can be used to track the added and removed pages using a single comparator, isNewPage.

    const isNewPage = (newPage, oldPage) => _.isEqual(newPage, oldPage) || newPage.src === oldPage.src;
    
    const addedPages = _.differenceWith(this.addressablePages, oldAddressablePages, isNewPage);
    const removedPages = _.differenceWith(oldAddressablePages, this.addressablePages, isNewPage);
    

    Another example shows how _.differenceWith can be used with a more specific comparator to find edited pages.

    const editedPages = _.differenceWith(this.addressablePages, oldAddressablePages, (newPage, oldPage) => {
        // oldPagesSrc is assumed to hold the src values of oldAddressablePages
        if (!_.isEqual(newPage, oldPage)) {
            return !oldPagesSrc.includes(newPage.src);
        }
        return true;
    });
    

    With this method, I am able to quickly identify differences in any objects in an array. This method is also flexible as I can declare my own comparator.
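    To make the idea concrete without depending on lodash, here is a plain-JS equivalent of what _.differenceWith computes, with made-up page data (lodash's actual implementation is more optimized):

```javascript
// differenceWith(arr, other, cmp) keeps the items of `arr` for which
// no item of `other` satisfies the comparator.
function differenceWith(arr, other, comparator) {
  return arr.filter(a => !other.some(b => comparator(a, b)));
}

const oldPages = [{ src: 'index.md' }, { src: 'about.md' }];
const newPages = [{ src: 'index.md' }, { src: 'contact.md' }];
const isSamePage = (a, b) => a.src === b.src;

// Pages in newPages with no matching src in oldPages, and vice versa:
const added = differenceWith(newPages, oldPages, isSamePage);
const removed = differenceWith(oldPages, newPages, isSamePage);
// added → [{ src: 'contact.md' }], removed → [{ src: 'about.md' }]
```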

    Current implementation of _.differenceWith can be found in the handlePageReload method in index.js of Site.

    Full documentation of Lodash's _.differenceWith can be found here

    KOH RAYSON

    Tool 1: Github Actions

    Aspect 1: Exposing secrets to workflows triggered from a forked repo

    Due to security concerns, Github Actions does not expose repo secrets to workflows triggered from a forked repo. However, there are legitimate use-cases where automated workflows requiring repo secrets can be useful, such as PR previews, automated labelling of PRs, generating code coverage reports, etc.

    Github Actions provides the pull_request_target and workflow_run events to cater for such use-cases. To summarize, pull_request_target runs the trusted workflow code in your actual repo (not the forked repo), and as such, it is able to expose the repo secrets during the workflow run. On the other hand, workflow_run can be triggered after a prior workflow run has completed, and it has access to the repo secrets. The ideal use-case that Github recommends for workflow_run is for the prior workflow to generate some artifacts (a file or collection of files produced during a workflow run), and for the privileged workflow to take the generated artifacts, do some analysis, and post comments on the pull request.
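    As a hypothetical illustration (the event types and step contents are examples only, not a workflow from this project), a workflow that runs with secrets available even for PRs from forks can be triggered like so:

```yaml
# Illustrative only: pull_request_target runs the workflow definition
# from the base repo, so repo secrets are available even when the PR
# comes from a fork. Treat the untrusted PR code with care here
# (e.g. avoid checking out and executing it with secrets in scope).
on:
  pull_request_target:
    types: [opened, synchronize]

jobs:
  preview:
    runs-on: ubuntu-latest
    steps:
      - run: echo "This job can read repo secrets"
```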

    Some useful resources:

    Aspect 2: Sharing data across different steps

    Github Actions provides the ability for a task to communicate with the runner machine to set environment variables and output values that other tasks can reference, via workflow commands. This can be useful for segmenting the workflow into multiple steps, improving code readability and maintainability.
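    A sketch of the idea (step ids and values are made up; note that the ::set-output workflow command used at the time of writing has since been superseded by writing to the $GITHUB_OUTPUT file on newer runners):

```yaml
# Illustrative workflow fragment: one step exports data via workflow
# commands, and later steps consume it.
steps:
  - id: meta
    run: |
      echo "::set-output name=pr_number::42"        # step output
      echo "FORK_REPO=owner/repo" >> $GITHUB_ENV    # environment variable
  - run: |
      echo "PR number: ${{ steps.meta.outputs.pr_number }}"
      echo "Fork: $FORK_REPO"
```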

    Some useful resources:

    Aspect 3: Github Tokens

    For Github Actions, there is a default GITHUB_TOKEN secret that can be used to authenticate in a workflow run. The token expires after the job is finished. With this, one doesn't need to explicitly generate a Personal Access Token for workflows that require write permissions on the repo.
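    For instance, an automated PR-labelling step could authenticate with the default token like this (an illustrative sketch, not a workflow from this project):

```yaml
# Illustrative: the automatically provided GITHUB_TOKEN authenticates
# this step, with no manually created Personal Access Token needed.
steps:
  - name: Label the PR
    uses: actions/github-script@v3
    with:
      github-token: ${{ secrets.GITHUB_TOKEN }}
      script: |
        github.issues.addLabels({
          owner: context.repo.owner,
          repo: context.repo.repo,
          issue_number: context.issue.number,
          labels: ['needs-review'],
        });
```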

    Some useful resources:

    Tool 2: Jest

    Aspect 1: Testing using data that are structurally similar

    It can be quite repetitive to write tests for data which have a similar structure. One example is unit testing an add(x, y) function, where you may wish to use the following test cases: {0,0}, {-1,-3}, {1000000000,1000000000}. Jest offers a feature to simplify testing for such cases where the data share a similar structure, using the .each operator. In the same example, the tests can be simplified as follows:

    test.each([
      [0, 0, 0],
      [-1, -3, -4],
      [1000000000, 1000000000, 2000000000],
    ])('.add(%i, %i)', (a, b, expected) => {
      expect(add(a, b)).toBe(expected);
    });
    

    Some useful resources:

    Aspect 2: Testing specific files or functions

    For testing purposes, it is sometimes more efficient to run a single test file rather than the whole test suite. Jest offers this feature: simply type npm test -- <PATH_TO_FILE> and it will run only the tests in that specific file.

    To test a specific method in the test file, you can mark it so that Jest knows to skip all other unrelated tests. This can be done using the .only operator (e.g. test.only('adds numbers', ...)).

    Some useful resources:

    Tool 3: Markbind

    Aspect 1: Testing CI scripts during development

    There are not a lot of resources online regarding testing CI scripts from a forked repo during development, especially for a mono-repo such as Markbind. The following is a general guideline for a CI build script that allows you to test your code changes:

    1. Navigate out of the current directory and git clone your forked repo
    2. Navigate into your cloned repo and run the setup instructions in your repo
    3. Navigate into the original directory - most CI platforms provide an environment variable for this (eg. Travis-CI provides TRAVIS_BUILD_DIR)
    4. Run the deploy steps

    Some useful resources:

    Aspect 2: Useful CI environment variables

    The implementation of markbind deploy --ci uses some useful CI environment variables to extract information such as:

    1. Which CI platform is the code being run on
    2. Getting the repo slug in the form owner_name/repo_name

    Some useful resources:

    Aspect 3: Regex

    Markbind relies on some regular expressions for parsing. It is good to have a basic understanding of regex in order to understand some of the parsing-related functions.
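    As a small, illustrative example (this is not a regex from MarkBind's source), a regular expression can pull the level and text out of an ATX-style Markdown heading:

```javascript
// Capture group 1: the run of '#' characters (heading level);
// capture group 2: the heading text.
const headingRegex = /^(#{1,6})\s+(.*)$/;

const match = '### Getting Started'.match(headingRegex);
// match[1].length === 3 (heading level), match[2] === 'Getting Started'
```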

    Some useful resources:

    ONG WEI XIANG

    Intersection Observer API


    This web API was brought to my attention by my mentor when I was implementing the Disqus plugin for MarkBind.

    Disqus comments are typically positioned at the bottom of the page. Since it can be expensive to load the comments, they should only be loaded when necessary — that is when users want to read them. Therefore, it is appropriate to introduce lazy-loading here, so that comments are only loaded when the user scrolls to the comment section.

    However, implementing a solution to detect whether a user has scrolled to a particular section can be difficult and "messy". This is where the Intersection Observer API comes in handy. One of the main motivations for this API was to tackle the lazy-loading problem, by providing a way to detect when a target element comes into the viewport.
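    A hedged sketch of the lazy-loading pattern (lazyLoadOnVisible and loadFn are hypothetical names; the actual MarkBind Disqus plugin differs in its details):

```javascript
// Run `loadFn` (e.g. a callback that injects the Disqus embed script)
// only once the target element scrolls into the viewport.
function lazyLoadOnVisible(target, loadFn) {
  const observer = new IntersectionObserver((entries) => {
    // isIntersecting becomes true when the target enters the viewport
    if (entries.some(entry => entry.isIntersecting)) {
      loadFn();
      observer.unobserve(target); // load only once
    }
  });
  observer.observe(target);
}
```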

    Sources

    Mozilla usually has very comprehensive and well-written documentation for web APIs. This is the case for the Intersection Observer API as well.

    Another good resource that I would recommend to learn about this API is an introductory video by Kevin Powell.

    Markdown Parsers


    Markdown is one of the core components of MarkBind. As I was working on the markdown feature to enable ++ for larger text, -- for smaller text, and $$ for underlining, I decided to research more about the markdown parsers in general.

    As a start, I was curious as to why we chose markdown-it over other alternatives. This led to the question: what are the features that we need from a markdown parser? I felt that this short article gave a pretty good explanation of the various features of a markdown parser and why we may want to choose one library over another, based on the features we need. On top of that, this summary and quick comparison of the well-known markdown parsers available gave me a glimpse as to why we decided on markdown-it.

    Even though marked is considerably more popular than markdown-it based on download count, markdown-it actually offers more features to users such as a better curation of community-written plugins. More importantly, markdown-it has good adherence to the CommonMark specification, which gives us a standard, unambiguous syntax specification for Markdown. Here is a specific instance where adherence to CommonMark specification has benefitted us in making syntax decisions for MarkBind. Furthermore, the API documentation for markdown-it seems to be more thorough and well-written than marked. This leads me to believe that perhaps marked is a better choice for lightweight usage while markdown-it is more suitable for users who require more complex features (which can be found in the form of plugins).

    What I have written so far is based on my brief insight into the available markdown parsers. There are a lot more details and nuances that I have yet to look into; markdown parsers are actually quite complex! I will continue to update this space as I find out more about markdown parsers.

    Why you should not use setTimeout to resolve asynchronous bugs (even when it seems to work)


    Whenever I suspect that I am looking at an asynchronous bug, I would usually use setTimeout to help me ascertain whether the bug is asynchronous in nature. Not knowing better, I used to employ it as a way to execute a function after a certain asset is loaded. However, after hearing Philip Roberts' explanation on Javascript's event loop and learning more about it, I know to never do it again. If Javascript's event loop sounds foreign to you, I highly recommend that you watch the video! It is arguably one of the most popular videos that explains the concept in a manner that is easily understandable with really good visual examples.

    I learnt that setTimeout(functionToBeExecuted, 0) doesn't actually execute functionToBeExecuted after 0 milliseconds; it only attempts to execute the function after 0 milliseconds, provided the call stack is empty. If there are other frames on the call stack, the callback function, functionToBeExecuted, will only be pushed onto the stack for execution after those frames are resolved.
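    This is easy to demonstrate with a tiny example:

```javascript
// The callback is queued, not run immediately: 'sync' is pushed
// first, because the current call stack must empty before the timer
// callback can run, even with a delay of 0.
const order = [];
setTimeout(() => order.push('timeout'), 0);
order.push('sync');
// Here, order is ['sync']; after the event loop turn it becomes
// ['sync', 'timeout'].
```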

    Here's the main problem: setTimeout is non-deterministic. There isn't any guarantee on when the callback function would execute. Thus, it's not a good idea to rely on this non-deterministic scheduling to execute your function at a specific time.

    Suppose you wish to use setTimeout to schedule a callback after a certain asset is loaded, you would have to "estimate" when this asset would be loaded. And this estimation is no more than a gamble when we have to take the speed of the user's network and/or device into consideration. Thus, this is never a reliable way to execute your callback function at a specific time.

    Instead, a better solution, perhaps, would be to look for an appropriate event hook such as DOMContentLoaded to ensure that your function executes at the specific time that you want.

    Hot-reload vs. Live-reload


    This topic confused me for quite a while when I started working on MarkBind. However, all my doubts were cleared when I read an article recommended by my mentor. Let me explain these concepts in the context of MarkBind.

    Hot-reload

    hot-reload is a feature that is enabled when MarkBind is served in development mode, and it is especially useful in boosting our productivity as developers. When served in development mode, hot-reload is triggered whenever there are changes to the source files that are bundled by webpack. Since our source files in core-web and vue-components are bundled by webpack, hot-reload will be triggered if we make any changes to the source files in core-web or vue-components.

    During runtime, hot-reload patches the changes that we have made in our source files to the files that are served in the browser, without having to reload the browser. Thus, we do not lose any state which is really helpful when we are tweaking the UI.

    Note that hot-reload is entirely enabled by webpack, specifically webpack-hot-middleware and webpack-dev-middleware. As the names suggest, these are middlewares that are injected into our development server, live-server, so that the hot-reload capability is enabled for our source files.

    Live-reload

    live-reload, on the other hand, will reload the browser whenever changes to the source files are detected. In live-reload, we are looking at a different set of source files (e.g. md, mbdf) when compared to hot-reload. Let me go through briefly how live-reload works in MarkBind currently.

    We make use of chokidar, a file watcher, to watch the source files for any changes made by the author. Upon detection of addition / deletion / modification of the source files, chokidar will execute the callback functions passed into it. The callback functions that we pass into chokidar will trigger the rebuild of the source files and output the built files into the designated output folder. Once live-server detects updates in the output folder, it will trigger a browser reload to update the files on the browser.
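    The wiring described above can be sketched as follows (wireLiveReload and rebuild are hypothetical names; the watcher is assumed to expose chokidar's .on(event, callback) interface):

```javascript
// Hook a rebuild callback onto the file events that chokidar emits
// for additions, deletions and modifications of source files.
function wireLiveReload(watcher, rebuild) {
  ['add', 'change', 'unlink'].forEach((event) => {
    // On any file event, rebuild the affected source file; once the
    // output folder updates, live-server triggers a browser reload.
    watcher.on(event, filePath => rebuild(event, filePath));
  });
}
```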

    Use-cases

    Essentially, you can see hot-reload as a feature that is meant for MarkBind developers (although the live-reload feature is also useful to us for testing purposes), and live-reload as a feature that is targeted at authors who use MarkBind to aid their write-up process.

    Server-Side Rendering (SSR) using Vue


    Implementing SSR for MarkBind is definitely the highlight of my learning and contributions this semester. In the following sections, I will be giving a summary of what I have learnt and implemented for MarkBind with regards to SSR. Note that I will only be discussing the main ideas of SSR.

    What is SSR?

    To understand what SSR entails, it is first important to understand what Client-Side Rendering (CSR) is. You probably would have seen these terminologies before if you have learnt about web development in general. But what are they really? And why do we need them? Let me kindly refer you to this article, Rendering on the Web. This article has a really good write-up on the various rendering options, their motivations, and the differences between them.

    Why should MarkBind adopt SSR?

    Our motivation to introduce SSR in MarkBind is to tackle the problem of flash-of-unstyled-content (FOUC). The reason FOUC occurs in the first place is MarkBind's usage of CSR.

    Let me give an example of how CSR causes FOUC. When we send our HTML page over to the client's browser, a portion of our HTML page can look something like this:

    <box>
    	example	
    </box>
    

    Note that <box> is a Vue component of MarkBind that needs to be compiled and rendered by Vue so that it can be transformed into proper HTML. After compilation and rendering by Vue, it may look something like this:

    <div data-v-7d17494f="" class="alert box-container alert-default">
    	<div data-v-7d17494f="" class="box-body-wrapper">
    		<div data-v-7d17494f="" class="contents">
    			example	
    		</div>	
    	</div>
    </div>
    

    As you can see, there is a drastic difference in the HTML markup. Most importantly, the classes, which our CSS applies styling to, only appear after Vue compiles and renders the <box> markup.

    Notice that the inner content of the <box> markup, example, would still be rendered by the browser in both scenarios. This means that during the time Vue compiles and renders the <box> markup, there will be no styling applied to example. Thus, we face a problem of FOUC here.

    A straightforward and proper way to resolve this is to use SSR instead of CSR; by compiling and rendering the HTML markup on the server before we send the HTML page to the client.

    How can SSR be implemented?

    Implementing SSR is not trivial. Beyond just implementing SSR, most of the time you would have to tweak your implementation to fit the setup that you are working with. Therefore, your implementation of SSR may differ in one way or another when compared to the SSR examples that you may find online.

    Here are some good resources to get started with SSR:

    • Official Vue SSR Guide
    • Vue & SSR: The best practices — This is a talk conducted by Sebastien Chopin, where he brings you through the various stages of implementing SSR. The code for his implementation of SSR in various stages can be found here.
    • VuePress — Created by the core development team of Vue, this is a static site generator that uses SSR behind the scenes. It provides a good reference for us to see how SSR can be implemented and how we can modify their implementation to fit our setup.
    • Vue Hackernews

    I believe that by going through the resources above, you will have a good idea of how SSR is generally implemented in "classical" setups.

    Implementing SSR for MarkBind

    After gaining a level of understanding of how SSR can be implemented in "classical" setups, we will now look at how SSR can be implemented in MarkBind.

    Pre-requisites

    Understanding how Vue works

    Before we start looking at how to implement SSR for MarkBind, it is important to first have a sound understanding of how Vue works. Some foundational questions that you may want to ask are:

    • What is a Vue instance?
    • What does it mean to compile Vue?
    • What are render functions?
    • Are there any differences between compiling Vue on client-side versus server-side?
    • What is the difference between compiling and rendering?

    Generally, I think that the official Vue documentation gives a decent explanation to the questions above. However, on top of studying the official documentation, I would also highly recommend watching this talk by the founder of Vue, Evan You - Inside Vue Components - Laracon EU 2017, where he explained the inner-workings of Vue.

    Understanding MarkBind's Architecture

    It is important that we first understand how MarkBind's packages interact with one another and the role each of them plays in the entire MarkBind setup. This is because all of them will be affected to various extents if we implement SSR in MarkBind; in my opinion, SSR is an architectural feature, after all.

    There are four packages in MarkBind's codebase:

    1. cli
    2. core
    3. core-web
    4. vue-components

    To give a better context of how these packages work in tandem to SSR, I will provide a short explanation below. However, if you are interested, you may read about the project structure here.

    vue-components is where we house all of our component assets like <box> and <panel> which users can use to enhance the appearance / behaviour of their pages. Essentially, vue-components is packaged as a Vue plugin which we will install on Vue on the client-side, so that HTML tags like <box> can be transformed by Vue into proper HTML.

    core-web contains the client-side script assets that we will send to the user's browser so that Javascript can be executed on the HTML markup. This package is also where Vue will install the vue-components plugin, create the Vue instance for each page, and then compile and render the page into proper HTML.

    Note that all of the compilation and rendering so far happens on the client-side — what we are doing is client-side rendering.

    If you have watched this video, Evan You - Inside Vue Components - Laracon EU 2017, that I have highlighted above, and have some understanding of MarkBind's codebase in general, you should also understand the following:

    • Each page in MarkBind represents a Vue instance / application
    • Each page in MarkBind is essentially a template which Vue has to compile into a Javascript render function (the actual process is [template —> AST —> render function], but we will skip the explanation of AST for brevity's sake)
    • The render function returns the Virtual DOM of the MarkBind page
    • The Virtual DOM helps to generate the Actual DOM (which is the HTML mark-up that is achieved after Vue is done compiling and rendering the page)

    Note: By creating a Vue instance for the page on the client-side (without passing in a render function as argument), the compilation of the page into render function and transformation of the HTML markup is done behind-the-scenes by Vue. When implementing SSR, however, we must split up the steps of compilation and rendering so that we have more control over the process.

    Integrating SSR into MarkBind's setup

    After understanding how Vue works and how MarkBind's packages are related, we then look at our MarkBind setup to see how we can integrate SSR into the setup. In essence, what we want to achieve here is to "shift" the client-side rendering onto our server, which is in the core library.

    To achieve that, we can break down the process into three steps: 1) compiling each page into Vue render function, 2) installing vue-components plugin on Vue, 3) instantiating a Vue instance with the render function to output the final HTML markup strings.

    1) Compiling each page into Vue render function

    In step one, we are looking to compile each page into the render function on the server, instead of in the browser. It is important to understand that this render function is representative of our final Vue application for a particular page. Thus, before we compile the page into the Vue render function, it is imperative that we finalize all the necessary DOM modifications we need to make for the page. There should be no DOM modifications going forward.

    Compiling each page into a render function is straightforward. We simply use vue-template-compiler to help us do that. Below is an illustration of how it works.

      // Compile Vue Page
      const compiledVuePage = VueCompiler.compileToFunctions(content);
    
      // Set up script content
      const outputContent = `
        var pageVueRenderFn = ${compiledVuePage.render}; // This is the render function!
        var pageVueStaticRenderFns = [${compiledVuePage.staticRenderFns}]; // These are the static render functions!
      `;
    

    This then brings us to the concept of universal application in SSR. Basically, in our context, this means that our Vue page application on the client-side and the server-side must be the same. Notice that what we are doing so far in step one helps us to achieve that exactly; as we pass in the same render function, which represents our page, when instantiating the Vue instance on both client-side and server-side.

    On top of passing in the same render function, we also have to remember to pass in the exact same state data when initializing the Vue instance. If we pass in different states into the Vue instance on the client-side vs. the server-side, the final HTML mark-up between the two may differ, causing us to have "different" applications on the client-side and server-side. This is problematic as it causes what is known as a hydration issue, which will be elaborated on in a later section.

    2) Installing vue-components plugin on Vue

    As mentioned, on the server-side, the render function that we obtained in step one will have to be passed in when instantiating the Vue instance. However, before we instantiate the Vue instance, we have to first install the vue-components plugin on Vue, so that the Vue instance will be able to render our page's Vue application into proper HTML strings (e.g. the <box> component can be rendered into the appropriate content that we have written for it in the vue-components package).

    In our old CSR setup, we did not have a way to bring this vue-components plugin into the core package. Thus, we have to look at our webpack configuration in core-web and modify it so that we are able to bring that plugin into our core package, to be able to conduct SSR.

    Let me explain the general changes we have to make to the webpack configuration.

    In our webpack configuration, we have to define what is known as client-entry and server-entry. The former is the bundle that we will be sending to the client and the latter is the bundle which we will need to use in the server for SSR.

    client-entry involves:

    • core-web client-side (browser) scripts
    • vue-components plugin

    server-entry involves:

    • vue-components plugin

    client-entry is essentially what we had in the old CSR setup (no changes). The only change here is adding a new server-entry, which bundles just the vue-components package that is needed for SSR.

    The code structure above is inspired by Vue's official documentation for SSR.

    3) Instantiating a Vue instance with the render function to output the final HTML markup strings

    In the third step, we want to finally execute SSR by using the render function to help us output the final HTML markup strings that we want to send to the client.

    As mentioned, the render function that we have obtained in step one will have to be passed into the Vue instance on both the client-side and server-side.

    In CSR, we would typically pass in the raw page template string as the argument instead of the render function. But this means that Vue will execute compilation on the page template string behind-the-scenes for us. Since we have already compiled our page into the Vue render function, we simply pass in the render function when instantiating the Vue instance and avoid compiling our page again.

    Here is a simple illustration of step three.

      const { MarkBindVue, appFactory } = bundle;
    
      const LocalVue = createLocalVue();
      LocalVue.use(MarkBindVue); // Install vue-components plugin! 
    
      // Create Vue instance!
      const VueAppPage = new LocalVue({
        render(createElement) {
          return compiledVuePage.render.call(this, createElement); // This is the render function!
        },
        staticRenderFns: compiledVuePage.staticRenderFns, // These are the static render functions!
        ...appFactory(), // Pass in state data!
      });
    
      let renderedContent = await renderToString(VueAppPage); // This is the SSR output!
    

    Let me explain more about appFactory in the code snippet above. appFactory actually comes from VueCommonAppFactory, which is one of the crucial components in our SSR setup. As mentioned earlier, both client-side and server-side will have to create their respective Vue instances. However, those two Vue instances have to represent the same Vue page application; meaning that render function, as well as the state data between them, have to be the same. This helps us to achieve a universal application so that we don't run into hydration issues, which will break our SSR implementation.

    Client-side Hydration in SSR

    Hydration is one of the biggest challenges you can face in implementing SSR for any setup.

    According to Vue, Hydration refers to the client-side process during which Vue takes over the static SSR HTML sent by the server and turns it into dynamic DOM that can react to client-side data changes.

    During the hydration process, Vue essentially diffs your SSR HTML markup against the virtual DOM generated by the render function on the client-side. If any difference is found, meaning that the application that we have on the client-side (the virtual DOM tree) differs from the SSR HTML mark-up that we send to the client, Vue will reject the SSR HTML output, bail client-side hydration, and execute full client-side rendering.

    This means that our effort in implementing SSR will be in vain, which is why I kept emphasizing ensuring a proper universal application in the previous sections. Theoretically, it isn't difficult to maintain a universal application per se, as long as you ensure two things:

    1. the state data are the same between client-side and server-side,
    2. after compiling and rendering the Vue page application, the SSR HTML mark-up is not modified.

    However, there are also other scenarios that can cause hydration issues, which will be explained in the following sections. In view of that, it is important that we nail down the universal application aspect of SSR first, before we deal with the other scenarios that can cause hydration issues.

    Penalties of Bailing Client-side Hydration

    When hydration fails, on top of the wasted time and effort in executing SSR, we will also incur the additional time penalty of executing client-side hydration (where CSR will follow afterwards).

    Fortunately, even if we face hydration issues and execute full CSR, the FOUC problem will still be resolved nonetheless. The reason for this is that the SSR HTML markup should resemble the CSR HTML markup to a large extent.

    That said, hydration issues typically occur due to minor differences between the client-side rendered virtual DOM tree and the server-rendered content. Of course, this assumes that we are adhering to the universal application concept as much as possible.

    Types of Hydration Issues

    The hydration issues that I have encountered when implementing SSR for MarkBind so far can be separated into two categories:

    1. Violation of HTML spec (e.g. having block-level elements within <p> tag)
    2. Modifying the SSR HTML mark-up after compiling and rendering the page

    The second category is easy to resolve — we just have to avoid modifying the SSR HTML markup!

    The first category was challenging, however. Typically, if you were to write raw HTML from scratch, your editor or validator would warn you whenever you nest a block-level element within a <p> tag. However, this isn't the case when we use Vue/Javascript to generate our DOM; the browser will not raise any errors or complain about it.

    Thus, it is imperative that we ensure the HTML markup that we generate is compliant with the HTML content models specification as much as possible (e.g. do not have <div> nested within <span>).
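    For instance, markup like the following (a hypothetical snippet, not from MarkBind) violates the content model and can trigger a hydration mismatch:

```html
<!-- Invalid per the HTML content model: a block-level <div> nested
     inside <p>. The browser's parser closes the <p> early, so the DOM
     it builds differs from the virtual DOM that Vue renders on the
     client-side, and hydration is bailed. -->
<p>
  Some text
  <div>a block-level element</div>
</p>
```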

    Dealing with Hydration Issues

    In this section, I will give some advice as to how you can resolve hydration issues in MarkBind. Remember to always refer to the developer console of the browser that you are using as that is where the hydration error messages will be logged.

    Using Development Vue

The first important step in resolving hydration issues is to ensure that you are using the development version of Vue and not the production version.

This is because only development Vue can report the exact mismatching nodes that are causing the hydration issues. Production Vue will only log a generic "Failed to execute 'appendChild' on 'Node': This node type does not support this method" error message that doesn't help you find out where or what the hydration issue is. Thus, it is nearly impossible to resolve hydration issues without the hydration error logs provided by development Vue.

Furthermore, when I was dealing with hydration issues, I even noticed cases where production Vue did not warn about hydration issues when they occurred (silent failures). Thus, remember to always use development Vue when trying to deal with hydration issues.

    Short-Circuiting Nature of Hydration Process

Note that the hydration process is "short-circuiting" in nature. The assertion process of hydration goes in a top-down manner. Once it encounters a mismatch of nodes, it will immediately bail hydration and execute client-side rendering. This also means that there can potentially be more hydration issues in later parts of the document.

    Understanding the Hydration Error Log

Here is an example of how the hydration error log may look in development Vue:

    ! Parent: >div
    
    ! Mismatching childNodes vs VNodes:
      NodeList(62) [...]
      VNodes(58) [...]
    
    ! [Vue Warn]: The client-side rendered virtual DOM tree is not matching server-rendered content. This is likely caused by incorrect HTML markup, for example nesting block-level elements inside <p>, or missing <tbody>. Bailing hydration and performing full client-side render.
    

When you first look at the numbers, it may come across as alarming that there are 62 nodes that are mismatching. However, that is not exactly the case. It simply means that in the parent node <div>, the SSR HTML markup has 62 child nodes, while in the client-side virtual DOM there are 58 VNode children.

    From my experience, most, if not all, of the hydration issues I faced in MarkBind are similar to this particular error log, where the problem is due to nesting block-level elements inside <p>. And this problem stems from an existing bug #958.

    Going forward, there should not be such hydration issues anymore. But if there is ever a case, you may want to take a look at #958 to see if the hydration issue is somehow related to it.

    Debugging Hydration Issues

Trying to pinpoint the cause of hydration issues in your source file can be a very tedious and manual process. Taking into account the top-down assertion process of hydration, it may be a good idea to start from the top of the source file and isolate which part of it is causing hydration issues. The "binary search" approach can also be a great way to isolate the source syntax that is causing the hydration issue.
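The binary-search idea can be sketched as follows, assuming a hypothetical `hydratesCleanly(k)` check that reports whether the document truncated to its first k blocks hydrates without errors (in practice, this would mean rebuilding the truncated page and watching the console):

```javascript
// Find the zero-based index of the first block that breaks hydration,
// assuming a document truncated before the offending block hydrates cleanly
// (i.e. the predicate is monotone). `hydratesCleanly(k)` is hypothetical.
function findFirstOffendingBlock(blockCount, hydratesCleanly) {
  if (hydratesCleanly(blockCount)) return -1; // whole document is fine
  let lo = 0;          // largest prefix length known to hydrate cleanly
  let hi = blockCount; // smallest prefix length known to fail
  while (hi - lo > 1) {
    const mid = Math.floor((lo + hi) / 2);
    if (hydratesCleanly(mid)) {
      lo = mid;
    } else {
      hi = mid;
    }
  }
  return hi - 1; // the prefix of length hi is the first to include the bad block
}

// Example: the block at index 5 is the first to break hydration, so any
// prefix of length 6 or more fails the check.
const firstBad = findFirstOffendingBlock(10, k => k <= 5);
```

This finds the offending block in O(log n) rebuilds instead of checking every block one by one.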

Most importantly, regardless of which debugging approach you take, I think we should always seek to isolate the part of the source file which we suspect is causing the hydration issue. When dealing with a large document, it is always a good idea to take the suspicious part of the document out and test it separately, to see if there are any hydration issues.

    Debugging hydration issues is not easy but here are some resources that really helped me in rectifying the hydration issues when implementing SSR on MarkBind:

    Other Challenges Faced when Implementing SSR for MarkBind

    Implementing SSR for MarkBind also introduced quite a few challenges along the way. I will not delve too much into the details of these challenges but here are some of them:

    • Learning more about how Vue works
    • Learning about webpack and how it is currently used in MarkBind
    • Understanding MarkBind's architecture and analyzing how SSR can be integrated into the setup
    • Setting up SSR for MarkBind in development mode
    • Ensuring that the SSR implementation works in all modes
    • Deciding how the automated tests would be affected with SSR in place (we decided to retain the HTML output that is not server-rendered)
    • Resolving CI issues where the tests run fine locally but not on certain CI platforms (e.g. setting Node environment variables; cross-env really helped a lot here)
• Resolving hydration issues due to existing bugs in MarkBind ([#1548](https://github.com/MarkBind/markbind/pull/1548))

    In my opinion, there aren't many resources out there for Vue SSR as compared to React SSR. Thus, if the resources for Vue SSR are not sufficient, you may want to reference some of the React SSR resources; the principles for both should be somewhat similar.

    Final Thoughts

Working on SSR required a lot of research and learning across the latter half of the semester. Beyond that, there were also quite a few implementation challenges along the way, as existing bugs were discovered in the codebase that made implementing SSR harder. Some of these bugs even threw me off and misled me into thinking that there were errors in my own implementation.

    There were quite a few moments where such problems got me scratching my head to figure out what was going on. This got me to realise the importance of having the ability to identify and diagnose problems accurately and that sometimes this only comes with experience (a.k.a learning it the hard way).

    In addition, code reviews with peers and senior contributors have never failed to surprise me. It is during these code reviews that I always found out the hard way that there were much easier and more elegant ways to solve the problem than the approach I came up with. Despite thinking that I could have saved so much time and finished the task earlier, I soon realized that it was not the point. The learning from the review process, in my opinion, really was the main takeaway from the module.

    At the end of the day, even though the cost-to-reward ratio for implementing SSR doesn't seem to be the best (where so much work is done just to resolve the FOUC issue), we also have to consider that in working on this PR, we also managed to iron out other existing important issues like #1548. Furthermore, it was a good opportunity that allowed me to learn about MarkBind's architecture and the associated technologies in greater detail. All in all, I think that the time and effort spent on this was well worth it.

    RYO CHANDRA PUTRA ARMANDA

    htmlparser2: DOM Traversal, Manipulation, and Things to Watch Out

MarkBind uses a combination of htmlparser2 and cheerio for operations on the HTML representation before finalizing the page to be served: the former parses HTML strings into pseudo-DOM nodes, while the latter manipulates those nodes, adds/removes nodes, and selects specific nodes, akin to jQuery.

However, as cheerio does a lot of work behind the scenes (especially in cheerio.load), the library can be deemed expensive to use in a performance-sensitive environment. Thus, cheerio is used sparingly, and only when the situation calls for it. With this in mind, I had to be creative in manipulating the DOM as best I could without cheerio.

After some research and a manual look into the DOM nodes in MarkBind, I gained some understanding of how the DOM nodes are structured.
    A DOM node provided by htmlparser2 is generally structured like this:

    {
        type: "tag",
        name: "span",
attribs: { class: "no-lang" },
        children: [
            {
                type: "text",
                data: "This is a span",
                children: [],
                parent: ..., // Circular reference to the "tag" node
                prev: null,
                next: null
            }
        ],
        parent: null,
        prev: null,
        next: null
    }
    

where each property is:

• type: The type of the node; it can be "tag" for an element with HTML tags, "text" for plain text, and more. (Note that "text" nodes cannot exist alone; they must be a child of another node.)
    • name: In the case of "tag" nodes, this is the name of the tag, such as "span", "div", etc.
    • attribs: In the case of "tag" nodes, this is an object to store attribute data of the tag, such as classes, ids, etc.
    • data: In the case of "text" nodes, this is the actual text content.
    • children: An array of DOM nodes which describes the node's children.
    • parent: The DOM node for the node's parent.
    • prev: The DOM node preceding this node.
    • next: The DOM node right after this node.

Once I understood this structure, I could start to play around with the DOM nodes.

The first thing to try is a traversal of the node. Generally, we can do a depth-first traversal by recursing into its children. I found this very useful for processing continuous data such as text with markup inside it.
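For example, a depth-first walk over nodes shaped like the structure above can collect all text content in document order (plain objects only; no htmlparser2 import is needed for this sketch):

```javascript
// Depth-first traversal over htmlparser2-style DOM nodes (plain objects with
// `type`, `data`, and `children`), concatenating all text content in order.
function collectText(node) {
  if (node.type === 'text') {
    return node.data;
  }
  return (node.children || []).map(collectText).join('');
}

// A node shaped like the example structure above.
const span = {
  type: 'tag',
  name: 'span',
  attribs: { class: 'no-lang' },
  children: [{ type: 'text', data: 'This is a span', children: [] }],
};
// collectText(span) yields "This is a span"
```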

Manipulation is then as simple as modifying the properties of the DOM nodes.

    • If you want to add new classes, you can just append the attribs.class property with the new class.
    • If you want to reorder the children, you can directly change the children array.
• In a similar vein, if you want to add or remove child nodes, you can directly add or remove them in the children array.

However, when you change the actual structure of the node (as in the last two points above), we need to watch out for several things regarding the references:

    • Set the prev value to be the node that is directly before this.
    • Set the next value to be the node that is directly after this.
• After the two above are set, update the node's parent reference to the node's actual parent.

This should handle the referencing well enough for the node to be rendered properly. However, note that the references are better handled with cheerio whenever possible; direct manipulation of references should only be done when under a constraint.
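The three fixups above can be sketched as a small helper (the function name is mine, not part of htmlparser2's API):

```javascript
// Insert `node` into `parent.children` at `index`, then fix up the `prev`,
// `next`, and `parent` references as described above.
function insertChild(parent, node, index) {
  const siblings = parent.children;
  siblings.splice(index, 0, node);
  node.prev = siblings[index - 1] || null; // node directly before this
  node.next = siblings[index + 1] || null; // node directly after this
  if (node.prev) node.prev.next = node;
  if (node.next) node.next.prev = node;
  node.parent = parent;                    // finally, the actual parent
  return node;
}

// Usage: build a <div> with text "a", "b", then insert "m" between them.
const makeText = data => ({ type: 'text', data, children: [], parent: null, prev: null, next: null });
const parent = { type: 'tag', name: 'div', attribs: {}, children: [], parent: null, prev: null, next: null };
insertChild(parent, makeText('a'), 0);
insertChild(parent, makeText('b'), 1);
insertChild(parent, makeText('m'), 1); // children are now "a", "m", "b"
```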

    highlight.js: Tokens and HTML-Preserving Quirks

    In researching ways to partially highlight text in a syntax-highlighted code block, I ended up understanding part of the way highlight.js applies syntax highlighting to a text.

highlight.js breaks the text down into tokens, where each token is wrapped in a <span> and has a specific styling defined in the applied theme's CSS. These tokens may contain other tokens, in which case they are displayed as nested elements. The token names are useful if you want to target specific styling at specific types of tokens, such as adding a stronger color for variables, muting the color for comments, and so on.

There is also a quirk in highlight.js's highlighting method: it preserves HTML in a code block when the text is broken down into tokens. That is, the HTML code does not get parsed and is not considered part of the code block. This means one can wrap part of the text in a custom styling HTML element, and the styling will still be displayed even though the surrounding text was parsed and styled by highlight.js.

This was a good candidate for partial text highlighting. Unfortunately, this does not apply when the language being parsed is HTML or one of its variants (e.g. XML), so partial text highlighting cannot be reliably done this way across all languages.

    References:

    live-server: The Live-reload Mechanism and Things to Know

MarkBind uses live-server to provide live-reload capabilities for users. In essence, given a configuration object, live-server starts a local development server, opens a browser tab that loads a specified file, sets up the live-reload mechanism for each page that is opened, listens for changes in the watched folders, and issues reloads to opened tabs whenever it detects one.

During the development of multiple-tab development for MarkBind, deeper research was done into how live-server provides its live-reload mechanism.

    How does live-server provide the live-reload mechanism?

    The live-reload mechanism is done by the use of WebSockets, which is a protocol that provides two-way communication over a single TCP connection. live-server leverages this as a channel between the package and the opened pages. Messages are sent through this WebSocket informing to reload the page whenever the package wants to issue a reload.

To establish the WebSocket connection and to actually reload the page, a script is needed to instruct the window to do so. live-server does this by injecting an inline <script> into the file before sending a response for an HTML (or similar) file, such that the script runs during page load. The package is able to inject the script midway through the request-response pipeline because it has full control of the server handling.

When the script starts to establish a WebSocket on the client side, it sends a request to the server with the WebSocket protocol. The server receives this and sends a 101 Switching Protocols response, to which the package attaches a listener; it completes the establishment phase by creating the corresponding server-side WebSocket and adding it to the package's local variable that holds the currently active sockets. Both WebSocket instances are coupled and form a communication channel for that page. Whenever the page is closed, the WebSockets emit a close event, which the package handles by removing the WebSocket from the local storage variable.

When a change is detected in a folder specified in the initial configuration, the package iterates over the active WebSockets stored in the local storage variable and, for each WebSocket, sends a message to the client side to reload the window.

    In essence, the establishment flow of live-reload can be summed up to:

    1. Request-response phase: Receive request > Inject live-reload script to response > Send response, then
    2. Document-load phase: Load document > Execute live-reload script > Establish client-side socket
    3. Socket-initialisation phase: Receive socket request > Establish server-side socket > Send connected message

    Note that the live-reload mechanism is set up after request-response phase finishes.

    Things to Know about Live-reload

In the implementation of live-server there are some quirks about the live-reload mechanism, some of which were instrumental in figuring out the behaviour of live-reload, and some of which I ended up addressing during the development of multiple-tab development for MarkBind.

    1. During reload, the original socket is closed and a new one reopens.

    A socket only lives until users navigate to other pages within that tab, or the tab reloads. When the tab reloads, instead of retaining the same socket as assumed, the socket will close first, and a new one will open. This behaviour becomes clearer after understanding that on reload, the window loads again, which goes back to the Document-load phase.

2. Reloads will be issued on any changes in the watched folder, not just related ones.

The implementation in live-server issues reloads indiscriminately on file changes. This can certainly cause user experience issues if the watched folder is volatile; we don't want users to experience reloads every second or so. A stronger guard is required to remedy this, such that reloads are only issued when the changed file corresponds to a URL that is currently opened.
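A guard like that could look something like the sketch below; `fileToUrl` is a hypothetical mapping from a changed output file to the URL it is served under, not part of live-server's API:

```javascript
// Only issue a reload if the changed file corresponds to a URL that is
// currently open (i.e. has an active socket).
function shouldReload(changedFile, openUrls, fileToUrl) {
  return openUrls.has(fileToUrl(changedFile));
}

// Hypothetical setup: URLs tracked per active socket, and a naive
// file-to-URL mapping that just prefixes a slash.
const openUrls = new Set(['/index.html', '/userGuide/index.html']);
const fileToUrl = file => '/' + file;

shouldReload('index.html', openUrls, fileToUrl);          // reload needed
shouldReload('devGuide/index.html', openUrls, fileToUrl); // no reload needed
```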

3. There is no stored knowledge of the socket correspondences.

The package only stores the sockets, without any additional data on which URL each socket corresponds to. This makes differentiating sockets harder when we want to keep a running status of the URLs currently opened through the server.

JavaScript: How to make a series of asynchronous callbacks run sequentially

When developing multiple-tab development for MarkBind, there was a need for page rebuilds to prioritize pages that are currently opened, rebuilding them in order of most-to-least recently opened. Since page generation runs asynchronously, this raises the need for a series of asynchronous callbacks to run sequentially.

One might come to a solution of using Array.forEach, assuming that async callbacks are executed one by one. However, note the (simplified) implementation of the function:

    Array.prototype.forEach = function (callback) {
        for (let index = 0; index < this.length; index++) {
            callback(this[index], index, this);
        }
    };
    

On each iteration, the function executes the callback and moves to the next iteration. For synchronous callbacks this effectively goes one by one as hoped, but for asynchronous functions it fires all the callbacks without waiting for each to finish, so the execution order is not guaranteed.

Surprisingly, there is no built-in function to do what we want. Fortunately, the solution is very simple: make the function async and add an await on the callback call:

async function sequentialAsyncForEach(arr, callback) {
    for (let index = 0; index < arr.length; index++) {
        await callback(arr[index], index, arr);
    }
}
    

    References:

    WebStorm IDE: Save Resource by Unmarking Compiled Folders

When developing the background-build system for MarkBind and testing it against the documentation files, I noticed that CPU usage spikes when WebStorm is in its indexing phase, going up to very high percentages (90-100%). Indexing happens whenever there are changes in the project folder. As I was testing against files within the project folder, the folder that stores the compiled files (also within the project folder) kept getting updated, so WebStorm kept re-indexing, which can tax the user's machine unnecessarily.

My finding and recommendation is to unmark this compiled folder, so that WebStorm does not trigger its indexing phase, saving resources in the process.

    References:

    lodash: Performance of uniq vs JavaScript's Set

For the case of filtering out elements and later testing other elements against them, we have two options: use lodash's uniq and test the elements with Array.some, or use the native Set and test with Set.has.

General experience might suggest the Set option, as existence checks are done in roughly constant time, compared to Array.some, which is linear in the size of the array. However, a performance test of uniq vs Set shows that even with small inputs the former can be up to 3 times faster. This speed difference makes the advantage of Set.has not really worth it.

With this finding, I am more inclined to use uniq when I know the filtered result is small enough (or the bounds of the array are small); otherwise I will use Set in general.
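As an illustration of the two options (using a hand-rolled uniq here so the sketch has no lodash dependency; the performance numbers above come from the linked benchmark, not from this snippet):

```javascript
// A stand-in for lodash's uniq: keep the first occurrence of each element.
function uniq(arr) {
  return arr.filter((item, i) => arr.indexOf(item) === i);
}

const items = [1, 2, 2, 3, 3, 3];

// Option 1: array-based dedup, with linear-time membership checks.
const uniqueArr = uniq(items);
const hasTwoArr = uniqueArr.some(x => x === 2);

// Option 2: Set-based dedup, with (amortised) constant-time membership checks.
const uniqueSet = new Set(items);
const hasTwoSet = uniqueSet.has(2);
```

Both yield the same answers; the difference is purely in constant factors and asymptotics, which is why the array option can win on small inputs.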

    References:

    RepoSense

    CHAN GER HEAN

    Vue

    Creating a new Vue project

    Things I learned about Vue:

    • It does not watch nested properties (of arrays, objects) in the state.

    Adding Pug to new Vue project:

    Ensure Vue works in sub-paths by configuring publicPath:

    Vue Good Practices

    • Use async await instead of promises.

    • The proper way to change a reactive deeply nested state object link

• Passing a function down to pass data to parents is an anti-pattern; it is better to use an $emit event to pass data.

    • Using store pattern for global variables.

    Javascript Observations

    • Difference between @change and @input link

    • Using Buffer to encode string to Base64 link

    • Importing ES6 modules in HTML is quite the challenge. Turns out that I can import them directly from Skypack as seen here by using <script type="module">

    Node Libraries for Vue

    BootstrapVue

    dotenv

    Vue Router

    Vuex

    Gradle

Background tasks:

• Scripts started by Gradle tasks are not interrupted when the Gradle daemon is interrupted. This means that if one is not careful, one can accidentally spawn many orphan processes, leading to a memory leak.
• To avoid this problem, one can spawn the script started by the Gradle task in a separate terminal.

    Github

    Nicer commit messages are beneficial even to myself, as it helps me to remember what I did previously.

    Using octokit/core for GitHub's REST & GraphQL APIs

    • How to fork a repository link
    • Adding secrets (need to get public key first) link
    • Create or update file contents link

Setting up OAuth for GitHub authentication

GitHub Pages shows a 404 when refreshing due to how Single Page Applications like Vue work. The solution is to modify 404.html to redirect to index.html.
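One common way to implement that redirect is sketched below; the `redirect` query parameter name is my choice, and the SPA router would need to read it back after index.html loads:

```javascript
// Sketch of the 404.html trick for SPAs on GitHub Pages: compute the URL to
// send the browser back to, preserving the originally requested path as a
// query parameter so the SPA router can restore it after index.html loads.
function redirectTarget(location) {
  const path = location.pathname + location.search + location.hash;
  return location.origin + '/?redirect=' + encodeURIComponent(path);
}

// Inside 404.html this would be used as:
//   window.location.replace(redirectTarget(window.location));
```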

Cherry-pick commits to merge (in case you work on the wrong branch by accident)

    Splitting a subfolder out into a new repository

    Autodeploy to gh-pages:

    • Run on cmd: yarn add gh-pages
    • Add to package.json:
"scripts": {
    "predeploy": "npm run build",
    "deploy": "gh-pages -d build"
}
    
    • To autodeploy: yarn run deploy

    HSU ZHONG JUN

    GitHub Actions

GitHub Actions is a platform hosted on GitHub for Continuous Integration and Continuous Deployment (CI/CD) purposes. Previously, RepoSense was using TravisCI for running the test suite, but due to changes in their pricing model, there was an urgent need to switch to GitHub Actions, which provides unlimited minutes for open source projects.

    The workflow syntax is the main starting point in designing a new CI job in GitHub Actions (known as a workflow). There are various types of events that trigger a workflow run, although a pull_request event is the most common.

However, as a workflow runs the code submitted by the PR owner, it can be exploited via "pwn requests" and compromise the secrets stored in the repository. Hence, GitHub made a design decision not to share any repository secrets with workflow runs triggered by the pull_request event. If a workflow needs to run with access to secrets, it has to be triggered using the pull_request_target event.

    RepoSense was vulnerable for a period of time due to this. Hence, work was done to mitigate the issue through the use of workflow artifacts.

Thus, a lesson learnt is to always write code with security in mind. The original design of the workflow was aimed at working around the limitations imposed by GitHub. However, it turned out to still be possible to compromise security with a carefully designed PR. Therefore, understanding the security implications of the code we write is vital to writing secure code.

    JUnit

    JUnit is a framework for writing tests for Java code. It is also a framework used by RepoSense in writing test cases quickly and easily.

    When conducting research to reduce the testing time, I had to learn more about how the test suite was designed, especially with the annotations used. Thankfully, there was a tutorial on JUnit annotations which explained the sequence of functions that are run based on the annotations provided. A simplified sequence is as follows:

    @BeforeClass → @Before → @Test (1) → @After → @Before → @Test (2) → @After → @AfterClass

    However, as many of the test functions are spread across multiple test classes, there was a need to run certain code after all test classes were run. That is, the code needs to run after every class's @AfterClass. A StackOverflow question suggested to use JUnit's RunListener, which is a hook that can run before any test runs and after all tests have finished.

That felt sufficient, but it turns out the tests could be prematurely terminated, resulting in the code for RunListener never being run.

    Instead, a blog post suggested:

    Don’t Clean Up After Tests with RunListener.testRunFinished(), Add a Shutdown Hook Instead

    Even so, it is still possible for the Java Virtual Machine (JVM) to be terminated externally, such that it is not even possible to run code from the shutdown hook. Nonetheless, it does show that despite your best efforts, there is always a way for the user to cause havoc, which is really a demonstration of Murphy's Law.

    CommonJS and ES6

    When doing frontend development work, the use of external libraries and packages is inevitable. However, there are different ways in which these external libraries can be included into a project, two of which are CommonJS and ES6.

CommonJS is a type of standard or specification that declares how libraries can be imported. Previously, global or shared variables had to be shared using the module.exports object, which is inherently insecure. The other way was to load dependencies using the <script> tag, which is rather slow and prone to errors.

ES6 is another standard which indicates how JavaScript-like programming languages should be written in order to ensure web pages render the same way on different web browsers. While this is the official standard, it cannot support modules that were written before the standard was produced. Hence, CommonJS and other workarounds were used to make the same import and export features available in earlier versions of JavaScript.

    A comment on Reddit explains the differences between CommonJS and ES6. It also explains when to use which, although for best practices, ES6 should be used moving forward. Another article lists the various differences and goes deeper into other topics such as asynchronous loading of modules, etc.

    Going deeper into this topic is useful for understanding the JavaScript programming language better, and to understand the differences between server-side JavaScript and frontend JavaScript.

    GitHub API

    The GitHub REST API provides many endpoints for interacting with various aspects of GitHub programmatically. It also allows using GitHub in a way that the web interface does not offer.

    The API was first used in RepoSense as part of the migration from TravisCI to GitHub Actions earlier in the semester. At that time, we wanted to set up pull request statuses in a way that allows for marking the preview websites as deployments. This required interacting with the Deployments API that is part of repositories, so that a nice "View deployment" button can appear for pull request reviewers to click on.

It is important to note that the GitHub API documentation is quite incomplete. The repository is available for anyone to contribute to, but there are many features that are "hidden" and take a while to find. For instance, as part of the deployment status, a special Accept header needs to be specified, known as preview notices. However, I encountered a situation where I needed to specify multiple preview notices, and this information was not provided in the GitHub documentation.

After some trial and error, I figured out the way to indicate this, which is to separate them with a comma:

    Accept: application/vnd.github.flash-preview+json,application/vnd.github.ant-man-preview+json
    

    This allows for both application/vnd.github.flash-preview+json and application/vnd.github.ant-man-preview+json to be used, which allows for in_progress & queued statuses, as well as the inactive state to be used.
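In code, building that combined header could look like this (the fetch call is only sketched, with an illustrative endpoint):

```javascript
// Combine multiple GitHub API preview media types into one Accept header,
// separated by a comma as described above.
const previews = [
  'application/vnd.github.flash-preview+json',
  'application/vnd.github.ant-man-preview+json',
];
const acceptHeader = previews.join(',');

// A request would then look like (sketch only):
//   fetch('https://api.github.com/repos/OWNER/REPO/deployments', {
//     headers: { Accept: acceptHeader },
//   });
```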

    HUANG CHENGYU

    Tool/Technology 1

Vue.js

    Documentation:

    https://vuejs.org/v2/guide/index.html

    Introduction tutorial:

    https://www.vuemastery.com/courses/intro-to-vue-js/


    Vuex

    Description of the tool: State management tool for Vue applications

    Aspect: Introduction to Vuex

    Documentation:

    https://vuex.vuejs.org/guide/

    Introduction tutorial:

    https://www.youtube.com/watch?v=5lVQgZzLMHc


    Vue LifeCycle Management

    Description of the tool: Vue component life cycle hook

Aspect: Methods that can be used to create hooks between the Vue component and the template, enabling the Pug template to load information before rendering, and the Vue component to access information in the Pug template before and after its initial rendering.

created is a useful method for loading data and retrieving the window hashes needed by the Vue component after the component is created.

    beforeMount is often used to access and modify the information of the component before the rendering.

    mounted is called after the rendering and it is also used to access or modify the information of the component.

It is important to distinguish between created, beforeMount and mounted.
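Roughly, assuming the Vue 2 options API, the three hooks sit in a component definition like this (the hook bodies here are illustrative, not RepoSense's actual code):

```javascript
// Sketch of where each lifecycle hook fits in a Vue 2 component definition.
const component = {
  data() {
    return { commits: [], hash: '' };
  },
  created() {
    // Data and computed properties are reactive here, but the DOM is not yet
    // mounted: a good place to load data and read window hashes.
    this.hash = '#example';
  },
  beforeMount() {
    // Runs just before the first render; last chance to tweak state
    // before the template is rendered.
  },
  mounted() {
    // The rendered DOM (this.$el) is available here; safe to access or
    // modify the component's rendered output.
  },
};
```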

    Related article:

    https://medium.com/@akgarg007/vuejs-created-vs-mounted-life-cycle-hooks-74c522b9ceee

    Documentation:

    https://vuejs.org/v2/guide/instance.html#Instance-Lifecycle-Hooks


    Vue Computed Properties and Watchers

    Description of the tool: Respective usage of computed properties and watchers in Vue component

Aspect: There is a difference between computed properties and watchers in a Vue component. A computed property is often used to compute a value based on declared variables, while a watcher is often used to detect changes in a watched object and respond accordingly.

    Documentation:

    https://vuejs.org/v2/guide/computed.html

    Tool/Technology 2

    SCSS and CSS

    Description of the tool: style sheet used by Vue User Interface Component

    Aspect: Difference between the usage of class selector in css and scss style sheet

    Documentation:

    https://www.w3schools.com/cssref/css_selectors.asp

    Related post:

    https://stackoverflow.com/questions/11084757/sass-scss-nesting-and-multiple-classes

    https://stackoverflow.com/questions/30505225/how-to-reference-nested-classes-with-css

    https://stackoverflow.com/questions/13051721/inheriting-from-another-css-class

    Related article about the naming convention in css style sheet:

    https://www.freecodecamp.org/news/css-naming-conventions-that-will-save-you-hours-of-debugging-35cea737d849/

    Tool/Technology 3

    Pug

    Description of the tool: template used to render the Vue component

    Aspect: Difference and similarities between Pug and Html

    Documentation:

    https://pugjs.org/api/getting-started.html

    Introduction tutorial:

    https://www.youtube.com/watch?v=kt3cEjjkCZA

    Tool/Technology 4

    Figma

Description of the tool: Graphical User Interface design tool that allows automatic generation of CSS style sheets. This can save time writing CSS according to the User Interface design.

    Aspect: Introduction to the usage of Figma

    Introduction tutorial:

    https://www.youtube.com/watch?v=3q3FV65ZrUs

    Tool/Technology 5

    JavaScript Syntax

Description of the tool: JavaScript syntax related to the retrieval of map keys, conversion from map to array, and iteration over maps

Aspect: Some instance methods for retrieving keys and entries in the Map class, such as map.keys and map.entries, do not seem to work in the RepoSense frontend script. The corresponding class methods of the Object class, such as Object.entries and Object.keys, can be used as an alternative.
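For instance, with a plain object standing in for the map (the data here is made up for illustration):

```javascript
// Using Object.keys/Object.entries on a plain object as an alternative to
// Map's instance methods.
const authors = { alice: 10, bob: 5 };

const names = Object.keys(authors);    // ['alice', 'bob']
const pairs = Object.entries(authors); // [['alice', 10], ['bob', 5]]

// Iteration works much like Map's entries():
let totalCommits = 0;
for (const [, commits] of Object.entries(authors)) {
  totalCommits += commits;
}
```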

    Related post:

    https://stackoverflow.com/questions/35341696/how-to-convert-map-keys-to-array

    https://stackoverflow.com/questions/16507866/iterate-through-a-map-in-javascript/54550693

    https://stackoverflow.com/questions/42196853/convert-a-map-object-to-array-of-objects-in-java-script

    Documentation:

    https://devdocs.io/javascript-map/

    https://devdocs.io/javascript-object/

    Tool/Technology 6

    Java URL decoder

    Description of the tool: The Java URL decoder to decode string

Aspect: The Java URL decoder can convert hexadecimal values in a string to the corresponding characters, which is helpful in path string conversion.

    Documentation:

    https://docs.oracle.com/javase/8/docs/api/java/net/URLDecoder.html

    Tool/Technology 7

    Git command for squashing commit

    Description of the tool: The git command to squash multiple commits

    Aspect: It is often necessary to squash multiple commits together when working on a pull request, which keeps the commit history concise and clear for the reviewer.

    Related post:

    https://stackoverflow.com/questions/5189560/squash-my-last-x-commits-together-using-git

    Documentation:

    http://git-scm.com/docs/git-rebase#_interactive_mode

    http://git-scm.com/docs/git-reset#Documentation/git-reset.txt-emgitresetemltmodegtltcommitgt

    Tool/Technology 8

    Cloning of GitHub repositories

    Description of the tool: Cloning of GitHub repositories with SSH

    Aspect: The steps to take to connect to GitHub with SSH and to clone a repository using an SSH protocol URL. Allowing SSH protocol URLs when specifying the repository to clone is a potential direction for RepoSense to explore.

    Related post:

    https://stackoverflow.com/questions/6167905/git-clone-through-ssh/16134428

    Steps to take to connect to GitHub using SSH:

    https://docs.github.com/en/github/authenticating-to-github/connecting-to-github-with-ssh

    Cloning of repository with SSH:

    https://docs.github.com/en/github/getting-started-with-github/about-remote-repositories#cloning-with-ssh-urls

    ROLAND YU WENYANG

    Pug

    Aspect Resource Description Resource Link
    Introduction to Pug Beginner's Guide to Pug - Useful to understand history and basic syntax https://www.sitepoint.com/a-beginners-guide-to-pug/
    Real time online converter from Pug to HTML Converter - Quick way to understand Pug code for beginners https://pughtml.com/

    Vue

    Aspect Resource Description Resource Link
    Introduction to Vue Vue Fundamentals https://vueschool.io/courses/vuejs-components-fundamentals

    Cypress

    Aspect Resource Description Resource Link
    Basic tips Cypress Testing Tips and Tricks - Practical tips for writing Cypress tests https://www.ulam.io/blog/cypress-testing-tips-and-tricks/

    PlantUML

    Aspect Resource Description Resource Link
    Reference Guide Description and syntax of PlantUML diagrams - A useful cheatsheet https://deepu.js.org/svg-seq-diagram/Reference_Guide.pdf

    TEAMMATES

    ADITHYA NARAYAN RANGARAJAN SREENIVASAN

    Angular

    • Typing in TypeScript, Components, Modules and File Architecture, Data Binding, Services, Dependency Injection
    • An Angular app typically has service classes to handle communication with the backend server and local storage. RxJS is usually used to create async services
    • Angular Modules: aid dependency injection among components
    • Angular Models: used for automatic data binding, which updates the view when the model changes
    • Structural Directives: ngIf and ngFor for templating

    Source: Angular Documentation, Tour of Heroes Tutorial

    Jest

    • Unit testing, component testing, snapshot testing, mocking services and API calls
      • Common matchers: expect(), toBe(), not, toEqual()
      • Truthiness: toBeTruthy(), toBeFalsy(), etc
      • Number comparison: toBeGreaterThan(), etc
      • String comparison: toMatch()
      • Arrays: toContain()
      • Exceptions: expect() + toThrow()
      • Mocking services: spyOn() + returnValue()

    Source: Jest Documentation, TEAMMATES Codebase

    Objectify

    • API for effective querying of entities such as filtering and member projection. Used to group and eliminate fields.
      • Defining entities: annotate a class with @Entity (other tags like @Cache can also be used if caching is needed)
      • Base operations: save(), load(), etc

    Source: Objectify Documentation

    Google App Engine

    • Indexing properties of an entity for searching, filtering and projecting in the queries
      • Use @Index to annotate the attribute used as index
      • GAE enforces usage of efficient queries and makes it almost impossible to write slow queries

    Source: Google App Engine Java Documentation

    RxJS

    • Parallelize dependent API calls using map, zip and mergeAll
    • map: takes in a parameter project to apply to each value emitted by the source Observable
    • zip: combine multiple observables into 1 observable
    • mergeAll: converts a higher-order observable into a first-order observable that delivers all values emitted by the inner observables

    Source: RxJS Developer Guide

    Google Cloud Logging API

    • Filtering of logs using labels after listing the entries
      • Use GCP client library google.cloud.logging
    • Mocked a local logs service using an in-memory model

    Source: Google Cloud Logging Java Documentation

    Selenium E2E

    • Identify web components using CSS selectors, class names and ids
      • Use the @FindBy annotation and the findElement method
    • Simulate user events like click and scroll using web driver
      • Use method click with the previously found web elements as parameters
    • E2E stubbing
      • Create a stub service to receive user events and log data to overcome GCP limitation on the development server

    Source: TEAMMATES Codebase

    DAO NGOC HIEU

    Angular

    Aspects: Components, Templates, Directives, Dependency Injection

    Source: Official Docs

    • An Angular app usually has 3 main types of component: UI, service and module. RxJs is heavily used in service classes to help UI classes communicate with the backend server
    • Learnt about ngTemplate to make reusable widgets
    • Learnt about specificity of CSS to change component styles without directive or TypeScript code changes
    • Learnt about @Input and @Output decorators for communication among the UI components:
      • @Input is usually used for the parent component to pass data to the child component.
      • @Output is useful when a child component wants to communicate with the parent component. An EventEmitter is usually decorated with this decorator to act as a callback for changes in the child component.
    • Learnt about modules in Angular: a component can be treated as a standalone module or as a part of a parent module. Modules aid dependency injection in Angular
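The @Input/@Output data flow can be sketched framework-free (a toy illustration, not real Angular code; a plain callback plays the EventEmitter role):

```typescript
// Framework-free sketch of the @Input/@Output pattern: the parent pushes
// data down to the child, and the child reports changes up via a callback.
type OnChange = (newValue: number) => void;

class ChildCounter {
  count = 0; // plays the role of an @Input, set by the parent

  // The callback plays the role of an @Output EventEmitter.
  constructor(private onChange: OnChange) {}

  increment(): void {
    this.count += 1;
    this.onChange(this.count); // "emit" the change to the parent
  }
}

class Parent {
  lastReported = 0;
  readonly child = new ChildCounter((v) => { this.lastReported = v; });
}

const parent = new Parent();
parent.child.count = 10;  // parent sets data on the child (like an @Input binding)
parent.child.increment(); // child notifies the parent (like an @Output event)
```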

    RxJs

    • Learnt about observable to receive APIs responses
    • Learnt about concat and merge to call paginated API requests:
      • concat is used for synchronous API stream, which is useful where API calls need to be in order and slow performance is tolerable
      • merge is used for asynchronous API streams, which is useful where performance is important and a burst of instances (due to a large number of async calls) is guaranteed not to happen
    • Learnt about catchError to handle errors without ending the stream of API requests allowing the FE to retry sending requests upon failures such as timeout:
      • catchError returns a new stream when an error is encountered. The stream can be ended by returning an empty stream or continue by reconstructing the original stream
    • Learnt about various concepts in RxJs:
      • Schedulers: centralized controllers for concurrency which allow the coordination of computation
      • Subject: equivalent to EventEmitter
      • Purity: RxJS allows the use of pure functions to generate stateful values through pipe
      • Flow: RxJS provides various flow controls such as throttleTime, filter and delay. These are usually used through pipe before subscribe for more robust behaviour

    Google App Engine

    Aspects: Database access and modification, APIs, Cron jobs, Task Queue

    Source: TEAMMATES codebase, Official Docs

    • Learnt about GAE timeout and its response to timed-out events which can be utilized to recover from such events
    • Learnt about appengine-web.xml to configure environment variables such as Java runtime version and scaling policy

    Java

    • Learnt that high-level data structures should be prioritized over primitive ones to make the code more extensible, e.g. ArrayList should be used instead of an array
    • Revised best programming practices and OOP principles in Java

    Objectify

    • Learnt about CRUD actions on the datastore
    • Learnt about server-side filtering feature
    • Other basic operations:
      • Defining entities
      • Loading/Deleting/Updating entities
      • Key-only queries
    • Transactions:
      • Entity groups: to allow atomic transactions to be carried out. The groups are defined based on each entity's parent
      • Optimistic concurrency: Objectify allows concurrent transactions, and any conflicts in timestamps of entities will cause the transaction to be rolled back. Optimistic concurrency generally has a higher throughput than pessimistic concurrency
    • Indexing:
      • To index an entity with an attribute, use the tag @Index (@Cache can also be used to cache entities)
      • Appengine indexing rules enforce the use of efficient queries, and make it almost impossible to write slow queries

    Google Cloud Platform

    • Learnt that parallelism can cause multiple instances to be created and thus increase the cost. Thus, large requests are broken down into smaller ones and run synchronously to avoid a burst of instance creation
    • Used API services such as the Gmail API to send notifications to customers and production logs to the admin
    • Used Cloud Trace to find performance bottlenecks in API calls
    • Used Cloud Profiler to identify and analyze performance bottlenecks
    • Used staging server to simulate a production environment for stress testing on performance bottlenecks

    Backend Techniques & Testing

    • Paginated API calls: break a huge request down into smaller requests. This allows more data to be sent at the cost of additional performance overhead
    • Created scripts to generate mock data and retrieve information about the database for the admin
    • Created LnP test scripts to assess performance of heavily used APIs such as student enrolment
    • Configured test.properties and JMeterElements.java for running LnP scripts against staging server by adding domain and port of the staging server as target point
    • E2E testing with Selenium:
      • Learnt about the usage of id, xpath and css selector to identify web components for testing purposes
      • Learnt about the usage of page objects to simulate user actions
      • Learnt about the usage of service stubs to simulate http responses
    • Learnt about JMeter as an application for LnP testing:
      • Create test data
      • Create csv test data to be sent with APIs to the BE
      • Construct LnP test plan:
        • Add csrf token and login sampler for CRUD actions
        • Add request headers
        • Add http sampler to carry out the request to BE with created test data
        • Run the test concurrently by configuring the number of threads
      • Remove data from datastore and delete test files after the test
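The paginated-call idea above can be sketched framework-free; fetchPage here is a hypothetical in-memory stand-in for a real backend endpoint:

```typescript
// Sketch of paginated API calls: break one huge request into smaller pages.
interface Page {
  items: string[];
  nextCursor: string | null;
}

// Hypothetical backend call: returns one page of results plus a cursor.
function fetchPage(data: string[], cursor: number, pageSize: number): Page {
  const items = data.slice(cursor, cursor + pageSize);
  const next = cursor + pageSize;
  return { items, nextCursor: next < data.length ? String(next) : null };
}

// Issue the next request only after the previous one returns: bounded
// memory and instance count, at the cost of extra round trips.
function fetchAll(data: string[], pageSize: number): string[] {
  const all: string[] = [];
  let cursor: string | null = "0";
  while (cursor !== null) {
    const page: Page = fetchPage(data, Number(cursor), pageSize);
    all.push(...page.items);
    cursor = page.nextCursor;
  }
  return all;
}
```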

    LI JIANHAN

    Front-End Knowledge


    Angular

    Basics

    Components are the building blocks of the web application. The component class is where data is temporarily stored and loaded into the HTML pages. Components handle user interactions e.g. what happens when a user clicks a button. Components form the bridge to other parts of the application e.g. backend, via services e.g HttpRequestService, AccountService.

    Aspects Learned

    • Understanding the component lifecycle
    • Understanding the underlying web technologies (HTML, CSS)
    • Injecting data into webpages
    • Understanding event emitters

    Resources: Intro to Angular Components, Component Lifecycle

    Angular CLI

    The Angular CLI allows the developer to quickly create components, modules and tests. Learning to use the CLI helped me to improve my workflow.

    Resources: Angular CLI tools

    Routing

    Aspects Learned

    • Defining new routes
    • Configuring the routing module

    Resources: Angular Routing Guide

    Front-End Testing

    Front-end testing is done using Jest/Jasmine which allows us to perform snapshot-testing. Snapshot testing is a technique that ensures that the UI does not change unexpectedly.

    Aspects Learned

    • Snapshot Testing
    • Mock functions (spyOn, callFake)

    Resources: Snapshot testing, Jest Object, Jasmine callFake

    RXJS

    Aspects Learned

    • Perform async operations via Observable
    • Understanding RxJS operators

    Resources: RxJS Observables, Subscription, RxJS Operators

    Back-End Knowledge


    Objectify

    Aspects Learned

    • Defining and registering an Entity (corresponding to a single POJO class)
    • Basic operations (e.g. save(), load(), delete())
    • Learnt about key-only queries and their benefits
    • Understanding the key format and how it has changed from Objectify v5 to v6
    • Understand the usage of key.toUrlSafe() and key.toLegacyUrlSafe()
    • Understanding Query cursors
    • Configuring OfyHelper for development and production environments
    • Understanding how LocalServiceTestHelper and LocalServiceTestConfig are used in testing against local app engine services

    Resources: Objectify Concepts, Entities, Objectify v6, Query Cursor, LocalServiceTestHelper

    Google Cloud Datastore

    Aspects Learned

    • Deploying to a staging server on GAE
    • Setting up, running and connecting to a Datastore Emulator
    • Exporting data from a staging server
    • Importing data into Datastore Emulator for backward compatibility testing
    • Testing different versions against staging server

    Resources: Running the Emulator, Export and Import Data

    E2E Testing

    Aspects Learned

    • PageObject Design Pattern

    Resources: PageObject Design Pattern

    Solr

    Solr is an open source enterprise search platform built on Apache Lucene. One major feature of Solr that TEAMMATES leverages is full-text search.

    What is full text search?
    Full text search

    • is a more advanced way to search a database
    • finds all instances of a term (word) in a table without having to scan rows and without having to know which column a term is stored in
    • works by using text indexes, which store positional information for all terms found in the columns you create the text index on
    • is term-based, not pattern-based
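A toy inverted index makes the text-index idea concrete (illustrative only; real engines such as Lucene/Solr are far more sophisticated):

```typescript
// Toy inverted index: each term maps to the rows (and term positions) that
// contain it, so lookup never scans rows or cares which column a term is in.
type Posting = { row: number; position: number };

function buildIndex(rows: string[]): Map<string, Posting[]> {
  const index = new Map<string, Posting[]>();
  rows.forEach((text, row) => {
    const terms = text.toLowerCase().split(/\W+/).filter(Boolean);
    terms.forEach((term, position) => {
      const postings = index.get(term) ?? [];
      postings.push({ row, position });
      index.set(term, postings);
    });
  });
  return index;
}

// Term-based lookup: a single map access instead of a full row scan.
function search(index: Map<string, Posting[]>, term: string): number[] {
  const postings = index.get(term.toLowerCase()) ?? [];
  return Array.from(new Set(postings.map((p) => p.row)));
}
```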

    Resources: Full Text Search

    Aspects Learned

    • Running and configuring Solr as a local search service
    • Providing Solr as a service to the backend using SolrJ Java Client
    • Indexing search documents in Solr
    • Understanding Collections, Sharding, Cores, Replicas
    • Running search service in docker
    • Query parameters and Filter Queries in Solr
    • Adding Copy Field rules to schema

    Resources: Solr Local Setup, Basics of Apache Solr, Terminologies, Docker/Solr, Filter query, Copy Field

    Github Actions

    Aspects Learned

    • Github actions and changing the workflow for CI
    • Learning YAML (e.g. block scalars, variables, syntax etc)

    Resources: Github Actions, YAML

    Regex

    Aspects Learned

    • Filtering query strings with regex
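A minimal sketch of filtering a query string with a regex allow-list (the parameter names here are made up for illustration):

```typescript
// Illustrative only: keep only query-string entries whose key matches an
// allow-list regex. The keys (courseid, fsname, user) are hypothetical.
const allowedKeys = /^(courseid|fsname|user)$/i;

function filterQueryString(qs: string): string {
  return qs
    .replace(/^\?/, "") // drop a leading '?'
    .split("&")         // split into key=value pairs
    .filter((pair) => allowedKeys.test(pair.split("=")[0]))
    .join("&");
}
```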

    Resources:
    Regex hands-on/cheatsheet

    LIM ZI WEI

    Jest

    Mocks or spies in Jest allow one to capture calls to a function, and to replace its implementation or return values. This is an important technique for isolating test subjects and replacing their dependencies.

    In our front-end tests, we use mocks in unit tests to ensure that our service methods are called correctly with the proper parameters. I learned to create mocks by using spyOn, which are then chained with methods such as returnValue or callFake to delegate calls from the spy to the mock function or return value.
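The capture-and-replace idea can be illustrated with a minimal hand-rolled spy (a toy sketch, not Jest's actual spyOn implementation):

```typescript
// Toy spy (not Jest's spyOn): wraps a function, records every call, and
// delegates to a replaceable implementation -- enough to show the concept.
interface Spy<A extends unknown[], R> {
  fn: (...args: A) => R;
  calls: A[];
}

function makeSpy<A extends unknown[], R>(impl: (...args: A) => R): Spy<A, R> {
  const calls: A[] = [];
  return {
    calls,
    fn: (...args: A): R => {
      calls.push(args);     // capture the call for later assertions
      return impl(...args); // delegate to the (possibly fake) implementation
    },
  };
}

// Usage: stand in for a service method and verify how it was called.
const spy = makeSpy((id: string) => `stubbed-course-${id}`);
const result = spy.fn("CS3281");
```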

    Jest allows for snapshot testing with toMatchSnapshot, which can be automatically updated when components change. Jest also allows one to test asynchronous code involving callbacks or promises; for instance, some of our tests were modified to support callbacks by taking and using an additional done argument.

    Resources:

    • Jest documentation: provides an understanding of the various forms of testing, including mocks, asynchronous code handling, snapshot testing, and more.

    Angular

    Angular is a framework written in TypeScript and used for the frontend of TEAMMATES. Angular makes use of modules and components, whereby each component instance comes with its own lifecycle. Hooks can be used for each event in the lifecycle to execute the appropriate logic when events such as initialization, destruction, or changes occur.

    Components make use of services that help to provide functionality; within the context of TEAMMATES, these can be API services that make the relevant calls to the backend.

    Resources:

    Objectify

    The Objectify API as used in TEAMMATES allows us to define new entities corresponding to the key/value layout within the Google App Engine datastore, and query for them or modify and delete them.

    As for the queries, it is helpful to note that keys-only queries can be effectively free, as they only return the keys of the result entities without needing to return the entire entities themselves. This can be helpful as reading and writing costs are both non-negligible in large-scale applications.

    Resources:

    Google App Engine

    GAE is the cloud computing platform used in TEAMMATES. It bundles together many services, some of which have since been released as standalone services.

    To manage and test logging (unavailable on the development server), I deployed TEAMMATES to a private staging server, which involved learning to use gcloud services to set up and monitor application activity. I also learnt about cron jobs and modified cron.xml to suit our purposes.

    Resources:

    Google Cloud Logging

    Cloud Logging is one of the services offered under GCP, and a key service to be used in the audit logs feature of TEAMMATES. As costs are key in a large-scale application, learning the pricing information regarding our logging services is important so as to pre-empt operating costs and design our features in such a way as to reduce them.

    On staging, I performed some tests regarding our new audit logs, and then metrics were monitored through the cloud console; such metrics include metrics on log usage among others.

    Resources:

    CSRF tokens

    Cross-site request forgery (CSRF) is a form of malicious attack, whereby a user with access privileges is unknowingly tricked into submitting an illicit web request.

    Traditionally, these can be protected against with methods such as CSRF tokens. A unique valid CSRF token value is set for some request, typically in a hidden form field, such that only requests from the particular web application will be considered valid.

    However, X-CSRF tokens serve as another alternative. The server can send a cookie setting a random token, which is received and read by JavaScript on the client-side. The token is then sent with each subsequent request through a custom X-CSRF-TOKEN HTTP header, which is validated by the server.

    This technique is used by frameworks such as Angular, and used in TEAMMATES as well. By default, we use these protections with non-GET requests such as POST and PUT, for instance in the POST requests for creating new audit logs.
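The double-submit check described above can be sketched in a few lines (simplified and illustrative; real frameworks also handle token generation, rotation, and cookie attributes):

```typescript
// Sketch of the X-CSRF-TOKEN double-submit check: the server sets a random
// token in a cookie, client-side JavaScript echoes it back in a custom
// header, and the server accepts only requests where the two match.
function isCsrfValid(
  cookieToken: string | undefined,
  headerToken: string | undefined,
): boolean {
  // A forged cross-site request cannot read the cookie, so it cannot
  // supply a matching header value.
  return !!cookieToken && cookieToken === headerToken;
}

const token = "rAnd0m-t0k3n";                        // as delivered via Set-Cookie
const okRequest = isCsrfValid(token, token);         // header echoed by client JS
const forgedRequest = isCsrfValid(token, undefined); // attacker lacks the token
```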

    Resources:

    Timezone configuration

    As our front-end and back-end use different libraries and databases when obtaining timezone information, and these timezones can change over time, we need to keep those two databases in sync. Both rely on IANA's tz database as the authoritative source for timezone information, but may be out-of-date and require manual updating when IANA releases a new timezone version. The front-end and back-end databases can be updated through means such as the grunt tool with moment-timezone, or the tzupdater tool.

    Resources:

    MO ZONGRAN

    GAE platform

    Hands-on experience with managed cloud resources after deploying TEAMMATES to my own staging server.

    • Learnt about managing resources in GAE
    • Learnt about deployment process and also scripting
    • Learnt about use of profiler to generate flame graph for observability
      • GAE Java 8 instances come with the profiler installed by default; just add the following to /src/main/webapp/WEB-INF/appengine-web.xml and apply some workload
        <env-variables>
            <env-var name="GAE_PROFILER_MODE" value="cpu,heap" />
        </env-variables>
    
    • Learnt about changing instance class in appengine-web.xml. Refer to instance classes available link

    GAE Related knowledge.

    • Objectify 5 and earlier use Memcache by default. After the GAE runtime upgrade, which brings an update to Objectify 6, there is no longer a default cache. https://github.com/objectify/objectify/wiki/Caching
    • Cache data decoding (deserialization) forms the main bulk of the backend memory consumption right now and, due to Java's managed memory, creates high memory pressure during some tasks such as fetching feedback response results. This cannot be resolved as long as we use a cache layer in front of persistent storage.

    Open Source Software Challenges

    Unlike enterprises with (mostly) free compute resources, an open-source project cares about the cost of every charged operation. Its needs are totally different from those of enterprise applications.

    • Learnt about the cost challenges of a real-life non-profit open-source project
    • Learnt about considerations involving cost savings in decision making
    • Learnt about user-centric decision making
    • Learnt to prioritize tasks, e.g. bugs that compromise agreements with users, over nice-to-have features

    Frontend E2E Testing via Selenium

    TEAMMATES E2E testing is built on Selenium, which starts a browser driver (Chrome or Firefox so far) and emulates a user's workflow to test inputs and the correctness of the webpage behaviour.

    • Configured the local E2E testing environment by downloading chromedriver and connecting to it in test.properties
    • Essentially, a browser is fired up and the test platform emulates user input, reads the resultant webpage, and checks whether the elements inside are as expected.
    • In E2E testing, the webapp must be built, since only the running backend server, which also serves the webpage, is tested.

    Java

    Some new Java knowledge learnt through experimentation

    • Java reclaims memory much faster after a servlet call completes than within a running servlet. The direct effect is that, when fetching the same data, a single servlet call results in a memory spike and lingering memory residue after the call fails, whereas a series of paginated calls results in a stable memory increment that plateaus
    • Parallelism cannot be overused, especially in Java where memory is managed and GC-ed. Resources reach contention very fast due to the generous memory allocation.
    • ArrayList apparently consumes significantly less memory than LinkedList. Choose the list implementation wisely; for non-performance-critical storage, ArrayList may be better in all cases.
    • Running a Java script (not exactly a script, but a convenient class) by adding the following to the Gradle file:
    task execScript(type: JavaExec) {
        classpath = sourceSets.test.runtimeClasspath
        main = "teammates.client.scripts.YourScriptName"
    }
    

    TAN CHEE PENG

    Testing in Angular

    Aspects: component testing, snapshot testing, unit testing and mocking.

    Resource: Angular Documentation


    GitHub Auth Operations

    Aspects: GitHub is deprecating all password authentication in favour of token-based authentication by Aug 2021, which affects the Git CLI. Customization of access controls with PATs

    Resource: GitHub Blogs


    Apexcharts

    Aspects: FOSS charts library for web pages.

    Resource: Apexcharts Website


    Google Analytics

    Aspects: Tracking of page views, events, costs associated with usage of analytics and opt-out privacy policies

    Resource: Google Analytics Documentation


    Google Cloud Logging

    Aspects: Logs API (GAE platform), Cloud Logging (GCP platform), differences between the two, how Cloud Logging on the upgraded GCP platform is more powerful than on GAE, and logs-related quotas

    Resource: Log API, Cloud Logging


    Cloud Logging Labels

    Aspects: Cloud logging labels that allow user-defined labels, their limitations and defaults

    Resource: Using log-based metrics


    Angular testing for window object

    Aspects: The window object is frequently used in the TEAMMATES codebase, e.g., window.location.href. Using Angular as a framework with Jest testing means that the window object should usually not be accessed directly. Testing the window object in unit tests then requires some APIs and dependency injection techniques

    Resource: InjectionToken, Injecting Window to Angular


    Proxy Pattern

    Aspects: Structural design pattern that provides a placeholder for another object. Mainly used for controlling access to the original object, allowing actions before the request reaches it. Used in the TEAMMATES backend as a layer over the Google Cloud Task API and Cloud Logging API. Allows mocking of third-party API services that we do not have direct control over for unit testing.
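A minimal sketch of the proxy pattern (the names here are invented for illustration, not the actual TEAMMATES classes):

```typescript
// Proxy pattern: the proxy shares an interface with the real service, so
// tests can substitute an in-memory mock for the real third-party API.
interface LogsService {
  queryLogs(filter: string): string[];
}

// Stand-in for the real Cloud Logging-backed implementation.
class CloudLogsService implements LogsService {
  queryLogs(filter: string): string[] {
    return [`cloud:${filter}`]; // would call the Cloud Logging API here
  }
}

// The proxy controls access to whichever delegate it is given.
class LogsServiceProxy implements LogsService {
  constructor(private delegate: LogsService) {}
  queryLogs(filter: string): string[] {
    return this.delegate.queryLogs(filter);
  }
}

// In unit tests, the delegate is an in-memory fake instead of the real API.
class InMemoryLogsService implements LogsService {
  queryLogs(filter: string): string[] {
    return [`memory:${filter}`];
  }
}

const prodProxy = new LogsServiceProxy(new CloudLogsService());
const testProxy = new LogsServiceProxy(new InMemoryLogsService());
```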

    Resource: PR for Proxy Pattern


    Data Warehouse

    Aspects: Uses, purpose and architecture of a data warehouse. Differences between a data warehouse and a database.

    Resource: AWS Data Warehouse Concepts


    New HTML Tags

    Aspects: The <datalist> tag and how it is used to support autocomplete as well as combining text inputs with select dropdowns

    Resource: HTML datalist Tag

    TAN TEIK JUN

    I have consolidated my learning into a blog here: https://teikjun.vercel.app/

    For easy reference, here are quick links to each of the blog posts. Click on them to find out more ⬇

    1. Setting up TEAMMATES on WSL2
    2. Testing with Jest
    3. Resources for learning Angular
    4. Reactive programming with RxJS

    ZHUANG XINJIE

    GAE First/Second generation runtime

    • Differences between Java 8 and 11 runtime
    • Necessity to separate third-party services such as search and storage services
    • Changes in the deployment workflow using .yaml
    • Changes in testing/CI setup: LocalDatastoreHelper and Solr service instance
    • How to translate GAE search API schema into Solr schema
    • Variations of HTTPClient that Solr supports for different situations (update-centric/query-centric)
    • Full text search across fields by enabling "Copy Field"
    • Differences between schema-less and schema mode in Solr entity
    • Various tokenization policies and their nuances in parsing query string

    Resources:

    • Definition
    • Application in modern search engine
    • Solr Java client "SolrJ"
    • Advantages of using search engine for read-only operation

    Resource:

    Cloud Storage

    • Understand various aspects of the Google Cloud Storage such as:
      • Blob storage
      • KeyFactory
      • ancestor filter and non-ancestor query
      • datastore query data consistency
      • search API concepts and practices (document, index, query, cursor...)
      • GCP data buckets import & export

    Resource:

    Datastore emulator

    • understand the usage of local datastore emulator for testing
    • usage of LocalDatastoreHelper in Cloud Storage unit test setup
    • data import/export between the emulator and staging server

    Resource:

    Cloud Logging

    • Understand the internal working of Google Cloud Logging.

    Resource:

    Objectify v5, v6

    • Common operations on entity schema, queries and migration
    • Interactions with Google Cloud Datastore in different versions (v5, v6)
    • Objectify unit test environment setup
    • Cloud storage key update
    • Learnt about incompatibility issues between versions and workarounds

    Resource:

    Adaptive and responsive design

    • definition, use cases
    • tradeoff and how to make choice

    Resource:

    GitHub Action

    • how to configure CI workflow to the existing project

    Resource:

    Angular test framework

    • available components and best practices to test Angular components

    Resource:

    CSS

    • different use and functionalities of flexbox layout

    Resource:

    Git

    Problem: permission denied when pushing to remote repo

    Solution:

    git remote rm origin
    git remote add origin git@github.com…git (SSH key)
    cat ~/.ssh/id_rsa.pub (view key)
    https://github.com/settings/ssh (add key into GitHub account)  
    

    Resource:

    Problem: delete undesirable commit history

    Solution:

    git reset --hard <SHA>
    

    Resource: Stackoverflow

    GitHub

    Problem: "refusing to allow an OAuth App to create or update workflow" on git push

    Solution: we need to grant the workflow scope to our OAuth token to allow modifications to GitHub Actions workflows in the project. Otherwise, the push will likely be declined.

    Resource:

    Java/OS

    How to switch between Java versions on the PC when working with different project environments?

    Resource: