Skip to content

Implemented Containerization of the Lyra Webapp #157

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 47 commits into
base: main
Choose a base branch
from

Conversation

henrycatalinismith
Copy link
Collaborator

@henrycatalinismith henrycatalinismith commented Feb 1, 2025

Description

Added containerization to the lyra project to be able to run the whole thing inside of a container.
Read through the Docker setup section within the README.md for more details on how this works.

@xela1601
Copy link
Collaborator

xela1601 commented Feb 2, 2025

Copying over @henrycatalinismith 's PR description here to not override it:

Not fully working yet but lots of annoying obstacles cleared so far. Here's the error message as of the end of the day.

--- ~/lyra ‹main* M› » make docker_run
docker run -it \
		-v /Users/henrycatalinismith/lyra/lyra/config/docker.yaml:/app/config/projects.yaml \
		-v /Users/henrycatalinismith/lyra/test-001:/projects/test-001 \
		-p 3000:3000 \
		lyra
  ▲ Next.js 14.2.15
  - Local:        http://769f01fc1fc2:3000
  - Network:      http://172.17.0.2:3000

 ✓ Starting...
 ✓ Ready in 64ms
loading
a [Error]: unable to fork
    at Object.onError (/app/webapp/.next/standalone/webapp/.next/server/chunks/871.js:26:44708)
    at /app/webapp/.next/standalone/webapp/.next/server/chunks/871.js:26:18797
    at new Promise (<anonymous>)
    at W.handleTaskData (/app/webapp/.next/standalone/webapp/.next/server/chunks/871.js:26:18556)
    at W.<anonymous> (/app/webapp/.next/standalone/webapp/.next/server/chunks/871.js:26:18181)
    at Generator.next (<anonymous>)
    at o (/app/webapp/.next/standalone/webapp/.next/server/chunks/871.js:26:4300)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5) {
  task: {
    commands: [ 'pull' ],
    format: 'utf-8',
    parser: [Function: parser],
    onError: [Function: onError]
  },
  git: ef {
    remote: '',
    hash: { local: '', remote: '' },
    branch: { local: '', remote: '' },
    message: 'unable to fork'
  },
  digest: '4075981265'
}
a [Error]: unable to fork
    at Object.onError (/app/webapp/.next/standalone/webapp/.next/server/chunks/871.js:26:44708)
    at /app/webapp/.next/standalone/webapp/.next/server/chunks/871.js:26:18797
    at new Promise (<anonymous>)
    at W.handleTaskData (/app/webapp/.next/standalone/webapp/.next/server/chunks/871.js:26:18556)
    at W.<anonymous> (/app/webapp/.next/standalone/webapp/.next/server/chunks/871.js:26:18181)
    at Generator.next (<anonymous>)
    at o (/app/webapp/.next/standalone/webapp/.next/server/chunks/871.js:26:4300)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5) {
  task: {
    commands: [ 'pull' ],
    format: 'utf-8',
    parser: [Function: parser],
    onError: [Function: onError]
  },
  git: ef {
    remote: '',
    hash: { local: '', remote: '' },
    branch: { local: '', remote: '' },
    message: 'unable to fork'
  },
  digest: '4075981265'
}

@xela1601 xela1601 changed the title Dockerfile progress for the day Implemented Containerization of the Lyra Webapp Feb 2, 2025
@xela1601 xela1601 closed this Feb 2, 2025
@xela1601 xela1601 force-pushed the issue-156/dockerfile branch from 4801da3 to 7c8e25b Compare February 2, 2025 11:31
@xela1601 xela1601 reopened this Feb 2, 2025
@xela1601 xela1601 marked this pull request as ready for review February 2, 2025 11:34
@xela1601 xela1601 force-pushed the issue-156/dockerfile branch from 790557d to ed66bc6 Compare February 2, 2025 11:48
@xela1601
Copy link
Collaborator

xela1601 commented Feb 2, 2025

there's still some issue with the unit tests that needs to be addressed

@xela1601
Copy link
Collaborator

xela1601 commented Feb 2, 2025

Also i just noticed that the publishing of translation changes is not working at the moment.

 Error while creating pull request: Error: fatal: ../zetkin-fork-for-lyra/src/locale/da.yml: '../zetkin-fork-for-lyra/src/locale/da.yml' is outside repository at '/app/zetkin-fork-for-lyra'

This is some weird issue with the paths that are being used throughout the webapp.
I'd propose we just configure all the paths being used in one central location...

What do you think?

@WULCAN
Copy link
Collaborator

WULCAN commented Feb 2, 2025

Also i just noticed that the publishing of translation changes is not working at the moment.

 Error while creating pull request: Error: fatal: ../zetkin-fork-for-lyra/src/locale/da.yml: '../zetkin-fork-for-lyra/src/locale/da.yml' is outside repository at '/app/zetkin-fork-for-lyra'

This is some weird issue with the paths that are being used throughout the webapp. I'd propose we just configure all the paths being used in one central location...

What do you think?

That looks like a bug in Lyra. If I recall correctly, we convert to absolute paths pretty early after reading projects.yaml but we must have made a mistake somewhere. Will you create a separate issue for this bug?

I'm not sure exactly what you are suggesting with configuring all paths in a central location. Our configuration file projects.yaml could be considered that already.

Copy link
Collaborator

@WULCAN WULCAN left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've reviewed all the changes and leave my notes here. I haven't done any testing yet.

I haven't concluded what needs fixing and what can be deferred to later, except for the one about copying config into the container image. That one seems like a serious issue to me.

@xela1601 xela1601 force-pushed the issue-156/dockerfile branch 2 times, most recently from 6d4350b to 38613d7 Compare February 2, 2025 19:56
@xela1601 xela1601 force-pushed the issue-156/dockerfile branch 3 times, most recently from dc564fe to 23d3297 Compare February 15, 2025 21:13
henrycatalinismith and others added 6 commits February 15, 2025 22:35
Signed-off-by: Alexander Schreiner <[email protected]>
Signed-off-by: Alexander Schreiner <[email protected]>
since it contains the github token which should not be inside of the image

Signed-off-by: Alexander Schreiner <[email protected]>
- improved logging
- introduced dynamic .yml and .yaml support for configuration
- added host projects config
- only copying webapp folder into builder stage
- bumping up to node version 23
- adding git username/password to docker env variables
- adjusted README

Signed-off-by: Alexander Schreiner <[email protected]>
@xela1601 xela1601 force-pushed the issue-156/dockerfile branch from 23d3297 to 83f84ef Compare February 15, 2025 21:38
@@ -1,4 +1,6 @@
/** @type {import('next').NextConfig} */
const nextConfig = {};
const nextConfig = {
output: 'standalone',
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With standalone output, Next can miss required dependencies from the monorepository root. Next documents that we need to configure outputFileTracingRoot to work around this.

https://nextjs.org/docs/14/app/api-reference/next-config-js/output#caveats

While our monorepository root only has devDependencies, one of those dependencies is typescript, which is actually a runtime dependency of webapp:

import ts from 'typescript';

Somewhat fortunately, we observe that standalone contains the required typescript in node_modules and we can also see that it includes the monorepository root package.json. Note that we get a webapp/server.js instead of the expected server.js. My interpretation is that Next has an undocumented feature, or perhaps a bug, making Next consider the whole monorepository, even when next build is run from inside the subproject.

vercel/next.js#72436

It's unclear what will happen if we follow the documentation by configuring outputFileTracingRoot when Next already has behaviour of considering the monorepository. I think either way, there's risk of our setup breaking when we upgrade Next or when we add more projects to our monorepository.

Resolving this upstream with Next is likely going to be slow, given how the issue linked above was handled.

If we could add a comment near our dependency declaration for Next, we could warn an upgrading developer. I do not believe it is possible to add comments near the dependency declaration in package.json. What we can do instead is add a section about it to README. There's a low probability of that being read by an upgrading developer but it will still be a lot more visible than a discussion in an old pull request.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@WULCAN

i was just wondering why we need this webapp module within the monorepo anyway.

Or to ask the question in another way: why can't we just have one package.json and merge all the stuff from webapp into the repository root 🤔

This would definitely fix that issue but will require some more changes.
Either way it would be good to know what was the reason for this decision in the first place.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@WULCAN
i adjusted the next.config.js and updated the README accordingly in 1e781b5

Copy link
Collaborator

@WULCAN WULCAN May 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, the conversations about #35 have been lost in slack history.

I believe we have some ideas about a future Lyra library and other Lyra tools than the web application. In that future, a shared version control repository for these different projects might be useful. @amerharb might remember more.

@henrycatalinismith
Copy link
Collaborator Author

henrycatalinismith commented May 25, 2025

Here are the remaining unresolved discussions.

return new ServerProjectConfig({
baseBranch: project.base_branch ?? 'main',
githubToken: project.github_token,
host: project.host,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggest to have a default value of github.com in the same way that we have main to baseBranch

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 786feed @amerharb

@WULCAN
Copy link
Collaborator

WULCAN commented May 25, 2025

I have reviewed Henry's a2ff94a , 97e8659, bc0b0aa and 5ec7e77 and approve of them.

I also double checked his amazing list of remaining issues above.

WULCAN added 5 commits May 25, 2025 15:42
The value for the parameter translation is entirely controlled by
an unauthenticated client on Internet. Reflecting that into logs
allows them to spoof other log message too, for example by using
newline characters in the value.

It might even give them control over the administrator's terminal
using control characters.
Prior to validation, languageName is entirely controlled by an
unauthenticated client. Logging its value allows them to inject
arbitrary content into logs, including but not limited to spoofed
log messages and control characters.
This log line reflects unauthenticated client content into logs.

If accessLanguage fails, either Next will log the exception or our warn
below will log similar info as this line. If accessLanguage succeeds,
our info will log similar info to this removed line.
accessLanguage only returns false-like if the project was not found.
Then the languageName is irrelevant.

languageName is entirely controlled by an unathenticated client at this
point. Logging its value allows arbitrary log injection.
For this log line, the value is entirely controlled by
an unauthenticated client. Reflect that into logs allows
arbitrary log injection.

Hex encoding it is safer, at the expense of readability for
the administrator. This error case should be rare though so
sacrificing readability is not too expensive.

toHex will be moved to a better place
when we discover more usages for it.
@xela1601 xela1601 force-pushed the issue-156/dockerfile branch from 5a929c5 to e052c82 Compare May 27, 2025 17:40
@xela1601 xela1601 force-pushed the issue-156/dockerfile branch from 7dff829 to f497ccb Compare May 28, 2025 09:02
@WULCAN
Copy link
Collaborator

WULCAN commented May 29, 2025

Note that:

  1. 5a929c5 was pushed with 5206e0b as parent but later replaced by e052c82 with a force push.
  2. 7dff829 was pushed with c15aa47 as parent but then replaced by f497ccb with a force push.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My npm 10.9.2 cannot reproduce the package-lock.json of 607027a from the corresponding package.json. That doesn't mean its wrong, but I think we need to understand why it's different. What tool did you use and how?

@@ -21,7 +21,7 @@
"@mui/material-nextjs": "^5.16.8",
"@mui/x-tree-view": "^7.23.0",
"@octokit/rest": "^20.1.1",
"next": "14.2.15",
"next": "^14.2.29",
Copy link
Collaborator

@WULCAN WULCAN May 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use two modules form the same project: next and eslint-config-next. The Next project expects them to be in sync.

"eslint-config-next": "14.2.15",

Actually, we also depend on a really fishy eslint-plugin-next 0.0.0 but let's handle that in a separate issue: #197


// 👇 make tracing start at the monorepo root
experimental: {
outputFileTracingRoot: path.join(__dirname, '../..'),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

../.. from webapp is outside of our monoreposiroty root. The example from Next.js docs is for a package in packages/web-app. Our package is just in webapp

See also https://github.com/zetkin/lyra/blob/f497ccb752d3e2ac384b036647605285e4c67d88/README.md#standalone-build-monorepo


```bash
npm run build
node webapp/.next/standalone/lyra/webapp/server.js --check
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's move --check to before the script, to make sure and clear that it's a flag to node, and not not to server.js


```bash
npm run build
node webapp/.next/standalone/lyra/webapp/server.js --check
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The flag --check tells Node to check the script's syntax without executing it. Does this really help to catch missing dependencies?

Comment on lines +62 to 82
/*
check if directory at repoPath contains something.
if it does, we skip cloning
if it does not, we clone the repo there since
"Cloning into an existing directory is [...] allowed if the directory is empty." according to git docs:
https://git-scm.com/docs/git-clone#Documentation/git-clone.txt-emltdirectorygtem
*/
if (repoFolderExists) {
const content = fs.readdirSync(spConfig.repoPath);
if (content.length !== 0) {
info(
`Repo folder at ${spConfig.repoPath} exists but is not empty, skipping clone.`,
);
return;
}
}

if (!repoFolderExists) {
info(`Cloning repo because it does not exist at ${spConfig.repoPath}`);
await RepoGit.clone(spConfig);
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If repoFolderExists, !repoFolderExists is not and we will not clone due to the branching at line 79.

owner: project.owner,
projectPath: path.normalize(project.project_path),
repo: project.repo,
repoPath: path.resolve(paths.lyraProjectsAbsPath, project.repo),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider an administrator setting up Lyra as a translation management system for two projects A/X and B/X.

She creates a projects.yaml with the following content:

projects:
  - name: A
    base_branch: main
    project_path: .
    owner: A
    repo: X
    host: github.com
    github_token: << github token >>
  - name: B
    base_branch: main
    project_path: .
    owner: B
    repo: X
    host: github.com
    github_token: << github token >>

When Lyra parses this in serverConfig.ts, Lyra will create two ServerProjectConfig with the same repoPath: lyra-projects/X.

As Lyra is used, Lyra will create two instances of RepoGit with the same repoPath. Probably, Lyra will only clone one of them, and use that for both projects.

There might be other severe concurrency issues too, consider for example that simple-git's clone is async, or that different projects can have different base_branch. This might also be a problem for monorepos where both owner and repo are the same but project_path differs. Basically, the previous implementation with separate repo_path for each project was a lot safer and easier to reason about.

For this pull request, let's just add a warning in README to the administrator that repo must not have the same value for two different projects.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Dockerize it
4 participants