
Who am I?

I'm Tom — Dad. Designer. Digital product tinkerer.

Full-time at Algolia, two kids under 5, south coast of NSW. Over 10 years in tech shipping scalable design and development work, from early-stage startups to enterprise companies.

Since my kids hit the scene, I retired from partying and now spend my free time consulting and building digital products.

I've got roughly 5 hours a week for side projects. B55 is the weekly log of what actually happens inside that budget. What ships. What doesn't. Insights and the uncomfortable ratio between building (easy) and selling (hard).

Come join me!

What I Shipped

The plan is to send these out weekly, but it took me some time to put it all together, so this first issue covers a few weeks. Still, making good progress!

  • Solidified my initial offering to test AI consulting and built out the MVP site aidoneright.com.au in one evening using Claude Code & Cursor.

  • Sent 5 warm outreach messages. 4 out of 5 replied, 3 calls booked: 2 done, 1 to go (no paid closes yet).

  • I built antlytics to avoid paying for detailed Vercel analytics, and as my first foray into Claude Code. P.S. I'm looking for beta testers — let me know if you're interested!

  • It's early days. In this sprint I added multi-site dashboard switching and new help docs, all built with Claude Code and Cursor. A year ago this would have been days of work on my current schedule.

This newsletter — finally. Named it. Set up Beehiiv. Spent 3 hours choosing between the names 55built, founder55, build15, and b55. Setting up the design on Beehiiv was actually faster 😅

Whoops… Missed finalizing the template and shipping issue 1 of https://thecreatureclub.co/

One thing I learnt

A colleague mentioned that Opus 4.7 has some powerful design replication capabilities, so I thought I'd give it a whirl.

I took a polished Dribbble design (thanks digi.co) and tried to replicate it as a working Next.js app using Opus 4.7 inside Cursor. Under 5 minutes, a couple of prompts. The result honestly impressed me.

Couple of things I noticed:

  • Opus 4.7 requires explicit prompting to get the best out of it. It's considered the first model that will actually punish you — or burn through your tokens — if you don't give it enough context or are lazy with your prompting.

  • From the first pass it was exceedingly good, needing only a few minor touch-ups to make my inner designer happy.

  • Mobile was decent on the first pass. The gallery auto-converted to a horizontal slider and grids stacked properly. A couple of overlap issues, but nothing major.

  • The only place it stumbled was a particular Unsplash image whose URL came back 404 and broke a section. It got stuck in a loop trying to fix it, so I patched it manually in the end.

For rapid prototyping or showing a client a design direction, this is legit. Not pixel-perfect but gets you 85-90% there in minutes instead of hours. I recorded the whole thing and put it up on YouTube if you want to see it in action.

What caught my eye


There were rumours of users being disgruntled about Claude results, and it turns out they weren't entirely unfounded…

Anthropic traced user reports of Claude Code degradation to three separate bugs, all now fixed as of April 20:
  1. Reasoning effort downgrade (March 4): They quietly switched Claude Code's default reasoning from "high" to "medium" to fix latency/freezing issues. Users noticed it felt dumber. Reverted April 7.

  2. Thinking history bug (March 26): A caching optimization meant to clear old reasoning from stale sessions had a bug — instead of clearing thinking once, it kept stripping reasoning on every subsequent turn for the rest of the session. This made Claude forgetful, repetitive, and burned through usage limits faster. Fixed April 10.

  3. Verbosity prompt change (April 16): They added a system prompt instruction telling Claude to keep responses under 25 words between tool calls. Internal evals didn't catch it, but it hurt coding quality. Reverted April 20.

Opus 4.7 is more literal and precise. It follows instructions exactly as written rather than generalizing or inferring intent. 4.6 was more likely to "read between the lines." So you need to be more explicit about scope (e.g., "apply this to every section, not just the first").

4.7 is more opinionated and less verbose in its default tone.

4.7 reasons more, uses tools less. It leans toward thinking things through internally rather than reaching for tools. 4.6 was the opposite, it would aggressively explore and use tools. You can push 4.7 toward more tool use by raising the effort level or prompting explicitly.

4.7 spawns fewer subagents by default (4.6 had a tendency to over-delegate). Both are steerable with prompting.

4.7 is better at bug-finding, but it also takes filtering instructions more seriously: if your prompt says "only flag important stuff," it might suppress real findings that 4.6 would have surfaced anyway.

4.7 has stronger frontend design instincts but comes with a persistent default aesthetic (cream/serif/terracotta) that takes concrete specs to override. 4.6 needs more prompting but has the same default aesthetic.

New effort level: 4.7 introduces xhigh (recommended for coding/agentic work). The effort parameter matters more for 4.7 than any prior model.
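If you're hitting the model through the API rather than Cursor, the effort level would just be another request field. Here's a minimal sketch of assembling such a request — note that the `claude-opus-4-7` model id and the exact shape of the `effort` field are assumptions based on the notes above, not confirmed API surface:

```python
# Sketch only: passing an effort level in a Messages-style API request.
# The model id and the "output_config"/"effort" field shape are
# hypothetical — check the official Anthropic API docs before relying
# on them.

def build_request(prompt: str, effort: str = "xhigh") -> dict:
    """Assemble request kwargs with an explicit effort level."""
    allowed = {"low", "medium", "high", "xhigh"}
    if effort not in allowed:
        raise ValueError(f"unknown effort level: {effort!r}")
    return {
        "model": "claude-opus-4-7",  # hypothetical model id
        "max_tokens": 4096,
        "output_config": {"effort": effort},  # assumed parameter shape
        "messages": [{"role": "user", "content": prompt}],
    }

request = build_request("Find bugs in this diff", effort="xhigh")
```

The point is less the exact field name and more the habit: if effort matters this much on 4.7, set it deliberately per task instead of leaving it at the default.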

Basically, 4.7 is the more capable and literal model. It rewards well-crafted prompts and punishes vague ones. 4.6 was more forgiving but also more prone to overdoing things. I can't help but think there's some level of milking higher token usage from users here, though.

Thanks for reading!

B55: building businesses on 5 hours a week.

Reply and tell me what you're building on limited time.

- Tom U
