>We inspected StarCoder-generated programs on these benchmarks and found that there were several cases where the model produces what are effectively empty solutions, e.g., pass or a comment Insert code here. We also observed this kind of failure in every model we evaluated.
I'm not sure whether the AI learning that it can just write "#TODO" is a sign our jobs are safe or a sign our jobs are truly in danger.
Could be a sign the thing knows how to break work into multiple pieces. If it wasn’t just 1-pass and you give it a couple turns to document / test / deliver, it definitely can fill in placeholders from the initial generative step when it does refinement. Language chains, not instant zero shot perfection
I'm not sure whether the AI learning that it can just write "#TODO" is a sign our jobs are safe or a sign our jobs are truly in danger.