This work was produced by our AI Assist SEG (Single-Engineer Group)[0]. The engineer behind this feature recently uploaded a quick update to YouTube[1] about this work and other things they are working on.
Why do you call this a group? Why don't you say "one of our engineers did this"? I read the linked article [1] and that seems to be the accurate situation: not a group of people including one engineer, not one engineer at a time on a rotation, but literally one person. In what way is that a group?
Great question. I'm not entirely clear on the origin of the name and it would probably be hard for me to find the folks behind this decision on Friday evening/Saturday so I'll share my interpretation.
At GitLab, we have a clearly defined organizational structure. Within that structure, we have Product Groups[0], which are groups of people aligned around a specific category. The name "Single-Engineer Groups" reflects that this single engineer owns the category they're focusing on.
I'll be sure to surface your question to the leader of our Incubation Engineering org. Thanks.
I imagine it's similar to working groups [0], but keeping the term the same when only one person is in the working group, hence single-engineer (working) group. Basically, parse `group` as in `working group`, treating it as a single term of art rather than literally meaning a colloquial group, i.e. two or more individuals.
I have nothing against a working group that happens to be one person at the moment, but this SEG is defined by being a single person. It would cease being a SEG if it got more members. Hence don't call it a group...
If it makes it easier, our title is Incubation Engineering, because it better reflects what we do: finding new areas to explore and kickstart, searching for new markets, or ways to expand existing markets. We need to choose initiatives within our scope (mine is MLOps) that a single person can deliver in a short period of time, and test the idea by putting stuff in production. Ideas that are too uncertain or risky to dedicate a full team to, but that would make sense for a single person initially.
Other recent outputs of the team are CloudSeed[0], improved GitLab Pages setup[1], and Code Reviews for Jupyter Notebooks[2].
OpenAPI, tests, and {Ruby on Rails w/ Script.aculo.us built-in back in the day, JS, and rewrite in {Rust, Go,}}? There's dokku-scheduler-kubernetes; and Gitea, a fork of Gogs, which is a github clone in Go; but Gitea doesn't do inbound email to Issues, Service Desk Issues, or (Drone,) CI with deploy to k8s revid.staging and production DNS domains w/ Ingress.
The definitions I see for group all use "a number", e.g. "a number of people or things that are located, gathered, or classed together". So if you're willing to accept 1 as a possible value for a number, it's not an oxymoron.
It is, at least in principle. As I see it, there's nothing that immediately discontinues a group when the last person leaves; it only makes it inactive. But a group could also become inactive in many other ways (e.g. deciding not to meet for a long time). It only stops being a group when whatever organization found a benefit in having it no longer does, and decides to discontinue it.
The world I live in is one which accepts both usual and unusual circumstances.
In the extreme case, consider a small company whose few owners and employees suddenly pass away (e.g. in a plane crash). The fact that the company suddenly has 0 employees and even 0 shareholders does not immediately terminate the company. It still continues to exist as a business and legal entity, until the government / lawyers either find new shareholders (typically heirs) who are willing to run the company, or appoint an administrator to wind it down.
John nailed it. A single engineer group is something being explored by one person, but could eventually represent a "product group" in the future. See https://about.gitlab.com/direction/#single-engineer-groups for more things that could be explored by such a person.
If you think there is a better name to capture that, I'm sure it would be considered.
I’m always suspicious of “one engineer did this” projects. There are probably quite a few engineers, product people, and other employees who simply go unattributed here.
Well, on the FauxPilot side I did do the initial implementation personally (over the course of a few weeks this summer) – but of course I'm building on the work the Salesforce team put into training the CodeGen models, and the work NVIDIA did in creating the FasterTransformer inference library and Triton inference server that FauxPilot uses.
I don't know how things are organized within GitLab, but it's pretty plausible to me that one engineer could put together the completion portion of the VSCode extension – you can see the code here: https://gitlab.com/gitlab-org/gitlab-vscode-extension/-/blob...
Fred has also been providing pull requests to improve FauxPilot, which I very much appreciate! :)
Another SEG here (our title is Incubation Engineer now, my area is MLOps). It is true that we build on top of what others did. What we mean by single-engineer group is that it is a one-person team: we are responsible for the project end to end, from defining the vision, to talking to customers, to implementing the solution and delivering a product. This is a relatively new team (a bit more than one year old). https://about.gitlab.com/handbook/engineering/incubation/
Well, the guy that wrote this fibonacci function that the model reproduced verbatim from some "learn to program" site sure went unattributed. That's AI for you.
To try it out, I wrote a Hello World function, and it literally named the tutorial it was taken from (i.e. "Hello, Tutorialname!"). So it definitely happens often enough, although it is probably usually harder to tell when straight-up plagiarism has occurred.
Open VSX allows anyone to upload an extension there with the same name and description as one from the VSCode marketplace, but with silently changed code or new releases, possibly introducing misfeatures or malicious code. Users typically don't notice because they think they are installing "the same" extension from the VSCode marketplace.
> I never published a version v999.0! It seems like you are using the unofficial open vsx marketplace (where, apparently, anyone can upload anything). You can find an issue here in this repository about it.
> Unfortunately, someone uploaded the extension in that version which blocks any further updates with that name.
> For now I believe in Microsoft's vision. I don't think a secondary marketplace is good for the community - It just causes confusion like this.
> If you setup a github action that automatically publishes this extension to open vsx, please open a PR! ;)
The established practice of having random individuals set up ad-hoc mirrors of VSCode extensions is a serious security issue.
If Open VSX wants to mirror VSCode extensions, that's okay - but they should do so with an automated process that mirrors ALL extensions and does not allow random people to silently change the code of an extension with no clear indication to the people installing it.
If, however, someone wants to copy the code of an existing VSCode extension, change some things, and upload it to Open VSX, that's super okay too (and in the spirit of open source), but please fork it and clearly indicate in the description that the extension is a fork, linking to the source code of the original extension. The current situation is unacceptable.
And I want to add that it's Microsoft that is in the wrong here. Their policy of only allowing the usage of their package repository if you are using their proprietary build of VSCode is absurd. It's as if npm disallowed the use of their repositories by yarn and pnpm. We shouldn't tolerate this behavior, especially not from a company that claims to "love open source".
But, Open VSX could and should do more for people to verify the provenance of their packages. There are many ways to do this. Perhaps one way is to have two kinds of packages (readily apparent in the description): one automatically imported from the VSCode marketplace (and guaranteed to match the upstream package exactly), and another kind published specifically for Open VSX.
Right now it seems to be a better security practice to simply ignore the VSCode marketplace terms of use and use it anyway on open source builds (either Code - OSS or VS Codium), instead of using Open VSX. And that's a shitty situation to be in.
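To make the provenance idea concrete, here's a rough sketch of the kind of check a registry (or a cautious user) could run: download the "same" extension from both registries and compare hashes. The publisher/name/version here are made up, the URL patterns are my assumption of the public endpoints, and in practice you'd want to compare unpacked contents, since transfer encodings can differ between servers:

    # Sketch of a provenance check: fetch the "same" .vsix from both registries
    # and compare hashes. URL patterns below are assumptions and may need adjusting.
    import hashlib
    import urllib.request

    def sha256_of(url):
        with urllib.request.urlopen(url) as resp:
            return hashlib.sha256(resp.read()).hexdigest()

    publisher, name, version = "SomePublisher", "some-extension", "1.2.3"   # hypothetical

    # Assumed Visual Studio Marketplace package URL pattern.
    ms_url = ("https://marketplace.visualstudio.com/_apis/public/gallery/"
              f"publishers/{publisher}/vsextensions/{name}/{version}/vspackage")
    # Assumed Open VSX download URL pattern.
    ovsx_url = (f"https://open-vsx.org/api/{publisher}/{name}/{version}/file/"
                f"{publisher}.{name}-{version}.vsix")

    print("match" if sha256_of(ms_url) == sha256_of(ovsx_url) else "MISMATCH")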
A planned feature is to implement INT8 and INT4 support for the CodeGen models, which would let the models run with much less VRAM (~2x smaller for INT8 and ~4x for INT4) :)
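For intuition on where that saving comes from, here's a back-of-the-envelope sketch of weight-only INT8 quantization in plain PyTorch. It's just an illustration, not the actual FasterTransformer implementation:

    # Toy weight-only INT8 quantization. Rough memory math: 16B params * 2 bytes
    # (FP16) ~ 32 GB, * 1 byte (INT8) ~ 16 GB, * 0.5 bytes (INT4) ~ 8 GB,
    # hence the ~2x / ~4x VRAM savings mentioned above.
    import torch

    def quantize_int8(w):
        # Symmetric per-row quantization: int8 weights plus one float scale per row.
        scale = w.abs().amax(dim=1, keepdim=True) / 127.0
        q = torch.clamp((w / scale).round(), -127, 127).to(torch.int8)
        return q, scale

    def dequantize(q, scale):
        return q.float() * scale

    w = torch.randn(4096, 4096)              # dummy weight matrix
    q, scale = quantize_int8(w)

    print(w.numel() * 2)                     # bytes if stored as FP16: ~32 MiB
    print(q.numel() * q.element_size())      # bytes as INT8: ~16 MiB (plus tiny scales)

    x = torch.randn(1, 4096)
    y = x @ dequantize(q, scale).T           # dequantize-on-the-fly matmul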
The 7GB model runs great on a 3080 Ti. I am getting a lot of 'ValueError: Max tokens + prompt length' errors with larger files. Can this GitLab client also replace the vocab.bpe and tokenizer.json config like Copilot's? Thanks for your work on FauxPilot, really enjoying playing with it.
I believe right now the VSCode extension just passes along the entire file up to your cursor [1] rather than trying to figure out how much will fit into the context limit – it's definitely still very early stages :)
It would be pretty simple to run the contents through the tokenizer using e.g. this JS lib that wraps Huggingface Tokenizers [2] and then keep only the last (2048-requested_tokens) tokens in the prompt. If they don't get to it first I may try to throw this together soon.
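Roughly this, sketched in Python for illustration (the real change would go in the TypeScript extension, using the JS tokenizer bindings mentioned above):

    # Keep only the last (context_limit - requested_tokens) tokens of the prompt.
    from transformers import AutoTokenizer

    CONTEXT_LIMIT = 2048        # model context window
    REQUESTED_TOKENS = 64       # completion length requested from the server

    # CodeGen uses a GPT-2-style BPE tokenizer, so "gpt2" is close enough for counting.
    tok = AutoTokenizer.from_pretrained("gpt2")

    def truncate_prompt(text):
        budget = CONTEXT_LIMIT - REQUESTED_TOKENS
        ids = tok.encode(text)
        if len(ids) <= budget:
            return text
        return tok.decode(ids[-budget:])     # drop the oldest tokens, keep the tail

    prompt = truncate_prompt("def fib(n):\n    return n if n < 2 else fib(n-1) + fib(n-2)\n" * 500)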
> A planned feature is to implement INT8 and INT4 support for the CodeGen models, which would let the models run with much less VRAM (~2x smaller for INT8 and ~4x for INT4) :)
That would certainly put the larger models within reach of most users. Part of my reluctance to deploy this myself was that I'd have to either redline the VRAM on an 8GB card or choose the smallest model.
It shouldn't require retraining, nope. I believe for INT4 there is a small adapter layer added that needs to be trained, but it is small and wouldn't require much data or computation to do so. Will know more once we actually start implementing :)
I don't have any to test on unfortunately! But it's a wiki; if you get some of the models running on those (it should be as easy as running ./setup.sh) please add a line saying that it works!
This seems to work ok on a 3080Ti, even under WSL2. Nice!
For larger files I'm getting errors for maximum token + prompt length - is there an easy way to tweak these limits for the client? I think the Copilot client needed some overrides for FauxPilot for this, so I'm hoping the GitLab client has that too.
I know there seem to be a few different models to choose from; does anyone have a resource that collects them and shows the strengths and weaknesses of the various options? I’d be particularly interested in models optimized for Apple silicon.
Unfortunately, Copilot is a lot more capable. Most importantly, it works with many more languages out of the box, is continuously updated, has more mathematical and scientific knowledge, and is better at understanding your comments.
As for which to use, you'd want to take the largest model that you can fine-tune (not necessarily on your machine) and that fits comfortably on your machine. The small models aren't as flexible and are basically just a cleverer autocomplete. Copilot is a lot smarter than that, and smarter than all the models I've tested up to 6B. It can often cleverly work out a lot of local patterns in what you're doing, and this is true surprisingly often for novel languages, math, understanding graph notation, categorization policies, and much, much more.
I'd gone in hoping I could remove my dependence on Copilot but left disappointed. The small models up to 6B just don't compare. I don't have the resources to run the 13B model, which is probably behind anyway due to not having trained as long, or on as much data, as Copilot (which is continuously updated). I also suspect Copilot has a great deal of hand-coded hacks and caching tricks to improve the user experience.
Yep, I definitely agree that the 6B and below models are worse than Copilot. The 16B ones are pretty good! But IMO still undertrained compared to Copilot, and of course much less accessible (though see elsewhere in this comment section; I think INT8 and even INT4 should be doable – this won't help much with inference latency but it should help most people fit the 16B model locally).
I have high hopes for the BigCode project too; they will get to take advantage of a lot of things that have been learned about how to train code models effectively.
One last note – I think that many of the improvements to Copilot since its release are attributable to "prompt engineering" and client-side smarts about what context should be included; I'm not certain, but I can believe the underlying Codex model used hasn't changed much (if at all) since code-davinci-002.
Nope. The way it works is that FasterTransformer splits the model across the two GPUs and runs both halves in parallel. It periodically has to sync the results from each half, so it will go faster if you have a high-bandwidth link between the GPUs like NVLink, but it will work just fine even if they have to communicate over PCIe peer-to-peer or via the CPU.
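If it helps to picture it, here's a toy single-process illustration of the split (not FasterTransformer's actual code):

    # Each GPU holds half the columns of a weight matrix, computes its half of the
    # matmul, and the partial results are synced before the next layer. In the real
    # thing that sync is an NCCL all-gather/all-reduce over NVLink or PCIe.
    import torch

    x = torch.randn(1, 8)
    w = torch.randn(8, 8)

    w0, w1 = w.chunk(2, dim=1)    # "GPU 0" and "GPU 1" each hold half of w

    y0 = x @ w0                   # computed on GPU 0
    y1 = x @ w1                   # computed on GPU 1, in parallel

    y = torch.cat([y0, y1], dim=1)            # the sync step
    assert torch.allclose(y, x @ w)           # same result as the unsplit matmul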
[0] - https://about.gitlab.com/handbook/engineering/incubation/ai-...
[1] - https://www.youtube.com/watch?v=VYl0dg8xyeE