Just 15 minutes + questions, we focus on topics about using and developing nf-core
pipelines. These are recorded and made available at https://nf-co.re, helping to build
an archive of training material. Got an idea for a talk? Let us know on the #bytesize
Slack channel!
Bytesize is back! We start the autumn session with a talk to aid anyone participating in the Nextflow and nf-core training. Marcel (@mribeirodantas) is going to show how to set up and use a Gitpod environment. This will be used in the training event starting on the 6th, so it is definitely worth a watch if you intend to participate!
Video transcription
:::note
The content has been edited to make it reader-friendly
:::

0:01 (host) Hello, everyone, and welcome to the first bytesize of the fall. I’m happy to have Marcel with me today. He’s from Seqera Labs and is going to tell us how to set up a Gitpod environment, which might be of particular interest for people who will join the training tomorrow. Off to you.
0:23 Hello, everyone. It’s great to be here again. Gitpod is a very nice product or service, and we’ve been using it to help people get hands-on access, as fast as possible, to the training material that we have for Nextflow. Some people have difficulties. They look for up-to-date material on how to use Gitpod, or on how to use Gitpod to access our trainings, the Nextflow community trainings. We decided to provide this bytesize talk so that people can have an up-to-date video tutorial on how to do it, and also to go into more detail about how Gitpod works, which is something that we don’t do during the training because that’s not really the focus.
1:11 I’m going to share my screen and show a bit of Gitpod. This is the official website, gitpod.io. The best place to find details about the pricing and everything else is here, not the training material, because it has changed reasonably often in recent months and years. Even though in the training we try to have the most up-to-date information about how it works, ideally you should come here and check. Sometimes even the way the usage hours are managed is different. The first thing to do is to sign up. I already have an account, so what I did was to take screenshots of the process. When you click on sign up or sign in, you’re going to see this screen, which means you don’t really create an account for Gitpod, but you use another account that you already have. In the past, you could only log in with GitHub, but now you can also log in with Bitbucket and GitLab. Once you choose one of these options, you’re going to see a window similar to this one. This one is for GitHub, asking you to authorize Gitpod to access your account and so on. Once you authorize that, your GitHub or Bitbucket account, or whichever you chose, is synced with Gitpod, and you get this window. You’re going to choose a name, and this is the thing that’s relatively recent: you can connect your LinkedIn account with your Gitpod account, and then you get 50 hours of usage per month. If you don’t do that, you only get 10 hours per month. It’s a bit annoying, but the 10 hours is more than enough to follow the foundational training. If you still want to use Gitpod often afterwards, then it may not be enough, and you may have to pay, or connect your LinkedIn account to get the 50 hours per month. But it’s not only that. You also configure VS Code, because in the end, Gitpod is going to provide you with a web version of VS Code in your browser. Here you can pick the default version that you want and the theme, and then it’s also going to ask for some information about you, where you work, what you’re using Gitpod for, and so on. In the end, you get this screen, which lists all your workspaces.
3:44 What are workspaces? What Gitpod provides to you in the end is not only this web version of VS Code, but also a virtual machine. You point to Git repository URLs, like GitHub repositories, that have a Gitpod virtual machine prepared for them. Then when you access that, you have one workspace. You may have multiple workspaces for the same repository, and they’re all going to appear here. I’m going to show one example, so we can go to GitHub, Nextflow, training. When you go to a repository that you know has a Gitpod setup, as the training material does, you just take the URL of the GitHub repository and put `gitpod.io/#` in front of it. Then you’re going to get this new workspace window. The detail that I want to emphasize is that here you have some options for the machine you want to use to run the container image that we built for the training material. You have the standard one and the large one. The 10 hours or 50 hours that we saw a few minutes ago assume that you’re using the standard machine. If you choose the large machine, which has more memory, more storage, and more CPU cores, you get less than 10 hours, or less than 50 hours if you connected your LinkedIn account. It’s much better to use this large machine, but be aware that the time is going to be shorter than 10 or 50 hours. I usually use the standard one, and it’s perfectly fine. You’re going to click continue, and then you’re going to see this screen loading and building the workspace for you. Preparing, starting. The first time you do that, it’s going to take a bit longer. It’s pulling the container image, it’s going to install some software, it’s going to do a few things, and in the end, it’s going to provide you with this window. It takes a while for everything to finish because there’s more that we want to show to you. Okay, so that’s ready now.
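The URL pattern described above can be sketched in a few lines of Python. This is only an illustration: the helper name `gitpod_url` is hypothetical, and the repository address `https://github.com/nextflow-io/training` is assumed to be the training repository shown on screen.

```python
def gitpod_url(repo_url: str) -> str:
    """Build a Gitpod workspace link by prefixing a Git repository URL.

    `gitpod_url` is a hypothetical helper written for this example only.
    """
    # Gitpod expects the full repository URL right after the '#'.
    return "https://gitpod.io/#" + repo_url.strip()


if __name__ == "__main__":
    # Assumed address of the training repository.
    print(gitpod_url("https://github.com/nextflow-io/training"))
```

Opening the printed link in a browser should lead to the new workspace window, where the machine type is chosen.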
6:21 You can close the debug console here, click here, and you have access to your terminal, which shows everything you see on the left. We are in the nf-training folder. This is the folder for the hands-on training, which is another training on this platform. We have it here, hands-on training. You can just collapse it here. This is the folder for the foundational training. This one here, the basic Nextflow training workshop. We can see with the `ls` output that we’re indeed inside this folder. Once we are inside, there are all these features that are the same as in VS Code, if you already use it. We have the file explorer here. You can click on any of these files to open them. You can close them with this X at the top. If you want to download a file, you can right-click on it, or whatever the equivalent is depending on how your machine is configured, and you have the option here to download the file. It’s going to download the file to your computer. You can rename files, you can delete files, you can copy the path, you can do lots of things, but all these features are pretty similar to what you have in VS Code, just like these other ones here.
7:27 We have this container image that’s already pulled, which will be used during the training for the rnaseq pipeline. We can open a browser here. Some are already installed. You can decide to install others. We have some search control. Anyway, it’s VS Code, right? You can create folders, files. There are many things. I could spend a whole day here showing you how VS Code works. You can do everything from this left panel with the right mouse button and so on, but you can also use the terminal. If I want to create a new file named `example.nf`, I can type `code example.nf`, and it will open a new window here for you to type something. You can just save that, and `example.nf` will appear here, just like it would on your machine when typing `code` for VS Code. Once you do that, you have access to all of this amazing environment. It’s not your machine. Maybe I don’t have Nextflow installed on my laptop, but here I have Nextflow, right? Maybe I don’t have Docker installed on my machine, but here I have Docker, right? Everything is already installed here. So when you’re following the training material, which is here, even if you don’t know how to install software, or you cannot install software on your machine for some reason, or you don’t want to mess up the configuration of your computer, you don’t have to do anything on your computer. You just go to the Gitpod workspace, this virtual machine, and do everything there because everything is already installed. For example, in the Simple RNAseq workflow that we’re going to build at some point in the training, you can just run `nextflow run script1.nf`. You don’t have that on your machine yet, but it’s here, right? So, I can just do `nextflow run script1.nf`. It’s going to work. It’s going to do whatever it says here, showing something, like printing the reads back, right? So, Gitpod allows us to very quickly start practicing Nextflow and to very quickly be able to follow the training material that we have, with the recordings and everything that you will soon see.
10:03 Another interesting thing is to set up a Gitpod instance for your personal GitHub repository. The focus of today is really to show how to use Gitpod for the training material GitHub repository, but it’s also interesting to show you what’s going on in the background. The first thing is that we have a container image that we created. Here’s the training GitHub repository. We can go to GitHub, and we have this Gitpod Dockerfile. This is just the Dockerfile we use with a GitHub Action to build this container image. It installs lots of software required for the training material, including Conda, Mamba, Nextflow, nf-core tools and a few other things, right? That’s some configuration. That is the container image. You could build it locally on your machine and push it to a container registry, but we have a GitHub Action doing that. The interesting thing, though, is the `.gitpod.yml` file at the root of your GitHub repository. Here is where you have all the magic happening. You have the workspace location, some information about what you want your VS Code to look like and to have, and so on. The checkout location in the virtual machine, right? Within a container, of course. You have some interesting things here. Enable for the master default branch, enable for all branches in this repository. Maybe you want to open this GitHub repository with Gitpod, but you want a specific branch, right? Not master or main. We have lots of things here. Then you have the container image that you will be inside when you open the Gitpod workspace. Here we have the container image whose Dockerfile I just showed you, right? And a few other things, a few tasks, and here are some VS Code extensions that we want by default, right? So, this `.gitpod.yml` is where most of the magic happens.
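To make the structure of such a file more concrete, here is a minimal Python sketch that renders a small `.gitpod.yml`. It is not the training repository's actual file. The function name `render_gitpod_yml`, the image name `example/training:latest` and the start command are assumptions chosen for illustration; the keys (`image`, `github.prebuilds`, `tasks`, `vscode.extensions`) are meant to mirror the settings mentioned in the talk.

```python
def render_gitpod_yml(image, extensions, start_command):
    """Return the text of a minimal .gitpod.yml (illustrative sketch only)."""
    lines = [
        f"image: {image}",                # container image the workspace runs in
        "github:",
        "  prebuilds:",
        "    master: true",               # enable for the default branch
        "    branches: true",             # enable for all branches
        "tasks:",
        f"  - command: {start_command}",  # executed when the workspace starts
        "vscode:",
        "  extensions:",                  # extensions installed by default
    ]
    lines += [f"    - {ext}" for ext in extensions]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # All three values below are assumed, not taken from the training repository.
    print(
        render_gitpod_yml(
            image="example/training:latest",
            extensions=["nextflow.nextflow"],
            start_command="nextflow -version",
        ),
        end="",
    )
```

Saving the printed text as `.gitpod.yml` at the root of a repository would be the starting point for a personal workspace; the Gitpod documentation lists the full set of supported keys.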
12:05 But then we also have the other file that we mentioned at the top of the previous file, which is `gitpod-ws.code-workspace`. Here, for example, I list the folders I want to be automatically opened in the file explorer on the left, as we saw a few minutes ago. Then a few settings also. All of that you can find on the Gitpod website, under resources and docs and so on. Everything is there if you want to have a Gitpod instance for your GitHub repository, right? When you put `gitpod.io/#` in front of the GitHub URL, Gitpod will look for these files, mostly for the `.gitpod.yml` that we saw here. The other one that it refers to is `gitpod-ws.code-workspace`. I don’t know if there’s much more to say about it. It’s simple, but I agree that without a video tutorial like this one, it can be challenging to start using Gitpod all of a sudden when you want to do the Nextflow training. But we believe that with this bytesize talk, we have a step-by-step video tutorial, with the screenshots I showed of every screen you see when you sign up. Indeed, we have this connect-with-LinkedIn detail now. It wasn’t like that in the past. Some people were trying to follow our courses or training materials and they were like, oh, there’s something different here. Why does it want my LinkedIn account, and so on? I believe that right now, when you connect with LinkedIn, it may also ask for a phone number to confirm. I’m not sure. Someone told me that, but I already have an account, so I can’t really be sure. Yeah, I think that’s it.
14:02 If you have any questions, I can answer them, or maybe focus on something specific that at first did not seem so important to me. Maybe you want to know something else about Gitpod or the training material and so on. Back to you, Fran, to manage these questions.
14:22 (host) Thank you very much. Are there any questions from the audience? Maybe I can start with one question from my side first.
(question) There seem to be quite a few files that you have to create in order to get this to work. Is there an easy way, like a builder or a wizard, that guides you through creating one? Do you know about that?
(answer) Yeah. These files are really only needed if you want to create a Gitpod workspace for your personal project, right? For the training material, everything is done already. They have a command-line tool called `gp`, and it has a command, I think it’s `gp build` or something, that you run and it builds the `.gitpod.yml` for you. On the Gitpod website, you go to resources and docs, and there are multiple pages explaining how to do that. How to build this file, how to test the other ones, how to install extensions, it’s all there. It’s very useful if you want to have a workspace to play with a Nextflow pipeline that you are developing, for example. But I want to emphasize that for the workspaces that are already there, like the training material, you don’t have to do anything. The only thing you have to do is to sign up for an account and go to the URL `gitpod.io/#` followed by the GitHub repository URL of the training material. That’s all you have to do. Then there is the signing up, of course, and waiting for the page to load, the workspace to build, and so on.
15:55 (question) We have a question in the chat. Cohen is asking, what is the main advantage over VS Code plus dev containers?
(answer) That’s a hard question, actually, because until a while ago, the obvious difference was that GitHub Codespaces was more expensive, right? I’m sorry, I’m confusing VS Code with GitHub Codespaces. That’s a good question. If I remember correctly, I have used dev containers, but I think I also used them with Codespaces. The thing is, until a while ago, Gitpod was better than the other solutions because it was cheaper. Everyone else offered more limited, less powerful machines and was more expensive than Gitpod. But then recently, things started to change. GitHub Codespaces, for example, became, I think, as cheap or maybe cheaper, because I think they also provide maybe 55 or 60 hours. I remember when they released the latest change, it was something like, we give you for free the same amount as Gitpod or maybe even a bit more. This was like a killing blow almost, I don’t know. Gitpod is still there, of course, and we love it and we use it. But it was a very, I don’t know how to say, it was a very strategic move from GitHub. Because now, maybe, a lot of people who used Gitpod before will use GitHub Codespaces. We even have some GitHub Codespaces for, I think, the nf-core documentation, if I’m not mistaken. But we’re staying with Gitpod because we already have everything working, it’s easy, and the number of hours they provide is enough for the training material. I can’t say it’s the best, but it’s been working pretty well for us.
(host) Okay, to come back to the question, there may not be an advantage, but it’s working pretty well for our purposes.
(speaker) Yes.
(host) Phil is also giving some comments. By the way, anyone can also unmute themselves so you don’t have to write that much. He says that all nf-core pipelines ship with configs and containers as part of the template, so there is no need to create them for those repos. But repos without configs will just work; it’s only if you want to have fancy stuff that you need the configs. He also says that dev containers is Codespaces, but can also run locally, though he’s not quite sure. Cohen could confirm that. But one thing is that Gitpod is Linux, so if you’re on a Mac, it can help, and Gitpod is totally disposable.
(speaker) This is something that I do a lot, actually. For example, I have a MacBook, right, so I’m on the Apple Silicon architecture here. Sometimes when I want to run some pipelines, things go haywire because the tools used are in containers built for Linux, and I have macOS, which means that my Docker is actually running a virtual machine with Linux, and my container runs inside that. My operating system is emulating all these instructions on the Apple Silicon architecture, so it’s a lot of emulation. If you use macOS, you know that using Docker on macOS is not really straightforward, in the sense that things don’t work as expected for no apparent reason. Sometimes it gets stuck, sometimes it takes longer, sometimes you get weird errors. Whenever I’m developing pipelines with Linux containers and so on, or I want to test something, I go to Gitpod, because I get very weird errors on macOS. Just like you said, if you want to have a Linux box, a Linux machine to play with, to run a pipeline and compute something, go to Gitpod. It’s very, very nice, and it’s disposable, right? When you’re done, you just close the workspace, and that’s it.
20:10 (speaker) One thing that Simon Pierce also mentioned in the chat, and I think it’s very important to emphasize, is that Gitpod is entirely web-based. The same way you don’t have to have Nextflow or Docker installed, you don’t have to have VS Code installed either. You may have just bought your machine, with nothing installed but the web browser, and that is enough. You just open Gitpod and everything will be there for you.
(host) We also just got the confirmation that dev containers run locally, or that it’s at least possible.
20:41 (host) Okay, great. Do we have any other questions from the audience? It seems everything is clear, so I would like to thank you, Marcel, for this nice talk. As usual, I would also like to thank the Chan Zuckerberg Initiative for funding our talks, and the audience for listening and for this nice discussion. Thank you very much.
(speaker) Bye everyone, have a nice day.