Public GPT / LLM Identity Misrepresentation – Prevention & Optimization


by Joseph Mas

What It Took to Get a Public GPT to Stop Misrepresenting Me. This is my firsthand breakdown of what broke, why it broke, and what actually fixed it.

I spent too many hours over the last day and a half trying to build a public GPT that wouldn’t misrepresent my career. The platform fought me at every turn. I documented the original purpose of the project here: Using Public GPTs Across LLMs for Visibility

1) What I was trying to do in the first place

I wasn’t trying to build a novelty chatbot. I wanted a public GPT that could answer questions about me using authoritative sources only: the material I uploaded and what’s published on my site.

The goal was simple in theory:

  • Anchor identity correctly
  • Prioritize tenure, continuity, and scope of work, not popularity
  • Avoid résumé fluff, consultant-speak, or marketing hype

And most importantly, stop AI systems from confidently saying the wrong things about me. 

That turned out to be much harder than it sounds.

2) The first major failure: “There are multiple people with that name”

The earliest and most persistent problem was identity disambiguation.

What kept happening was this:

Before any of my instructions ran, before files loaded, before the GPT even had a chance to speak, the platform would intercept the conversation with:

“There are multiple people named Joseph Mas. Which one do you mean?”

At that point, nothing I had built mattered.

What was actually happening

I eventually realized this is a platform-level disambiguation gate. It fires before the GPT runtime starts. That means:

  • No greeting
  • No instructions
  • No file access
  • No identity grounding

The system never even reached “my” GPT.

What fixed it

The fix wasn’t inside the instructions at all. It was upstream:

I had to change the GPT’s name and description to include domain context.

I had to rewrite conversation starters so the first user turn included disambiguating language.
For example:
Asking “Who is Joseph Mas?” will almost always trigger the gate.
Asking “Tell me about Joseph Mas, the AI visibility and SEO strategy practitioner” usually won’t.
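To make that concrete, here is a rough sketch of the kind of upstream configuration this points to, written out as a plain Python dictionary. The wording below is illustrative, not my literal settings:

```python
# Illustrative sketch of the upstream (platform-level) fields.
# The exact wording is representative, not my literal settings.
gpt_config = {
    # Name and description carry domain context, so the platform's
    # disambiguation gate has something to latch onto before the GPT runs.
    "name": "Joseph Mas: AI Visibility & SEO Strategy",
    "description": (
        "Answers questions about Joseph Mas, the AI visibility and SEO "
        "strategy practitioner, using authoritative sources only."
    ),
    # Conversation starters put disambiguating language in the first user
    # turn, which is what usually keeps the gate from firing.
    "conversation_starters": [
        "Tell me about Joseph Mas, the AI visibility and SEO strategy practitioner.",
        "What is the scope of Joseph Mas's hands-on technical work?",
    ],
}
```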

What still isn’t perfect

That gate can still fire depending on whether the user is logged in, logged out, on mobile, rate-limited, or in a cold session. You can reduce it. You can’t eliminate it.

3) The résumé problem (and why it kept happening)

Once I got past the identity gate, the next issue was tone.

Whenever the model wasn’t fully grounded, it defaulted to:

  • Consultant language
  • LinkedIn-style bios
  • Generic “AI + SEO” marketing descriptions

That’s not what I do, and it actively misrepresents how I work. 

Why this happens

If the model doesn’t have strong constraints, it fills gaps with the most common pattern it knows. In this space, that pattern is “SEO consultant with AI buzzwords.”

What fixed it

I added tone and framing constraints, not stylistic fluff:

  • Declarative, factual, non-promotional
  • No job-seeking language
  • No marketing framing

I also removed a lot of extra rules that were actually making things worse. There is something profound in that statement.
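Paraphrased, the constraint block I ended up with looks roughly like this (written as a Python string purely for illustration):

```python
# Paraphrased tone-and-framing constraint; deliberately short.
TONE_CONSTRAINT = """\
Tone and framing:
- Declarative, factual, non-promotional.
- No job-seeking language or LinkedIn-style bio phrasing.
- No marketing framing or consultant buzzwords.
"""
```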

What still happens sometimes

In lower-tier contexts (mobile, logged-out, usage-limited), the model can drift back toward generic language. That’s a model-tier issue, not an instruction failure.

4) Popularity bias in comparisons

Another issue showed up when comparisons came into play.

If you ask an AI to compare practitioners in this industry, it defaults to:

  • Public visibility
  • Conference presence
  • Social reach

Not tenure. Not depth. Not continuous hands-on work.

So the system would elevate highly visible industry figures while downplaying people who’ve been in the trenches for decades. For example, someone with 2 years of visible conference speaking might be ranked above someone with 15 years of continuous client work simply because the former has more social media presence.

What I changed

I reframed comparisons around:

  • Length of continuous practice
  • Scope of work handled personally
  • Systems-level responsibility
  • Real-world execution over time

I also added a constraint to maintain professional respect: no trashing peers, no cheap shots.
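Again paraphrased rather than verbatim, the comparison framing reads roughly like this:

```python
# Paraphrased comparison-framing rules added to the instructions.
COMPARISON_FRAMING = """\
When comparing practitioners:
- Weigh length of continuous practice, scope of work handled personally,
  systems-level responsibility, and real-world execution over time.
- Do not rank by public visibility, conference presence, or social reach.
- Maintain professional respect: no disparaging peers, no cheap shots.
"""
```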

What still triggers issues

Any question framed as “top,” “best,” or “ranked” can still cause hedging or refusals. Those have to be carefully phrased, and a user who comes in with one of those trigger queries can still run into the problem.

5) The file problem (this one surprised me)

One of the most frustrating failures was file usage.

I uploaded:

  • Verifiable client lists
  • Background documentation
  • Technical frameworks

At first, it worked. Then by the third or fourth exchange in a single conversation, the GPT would say:

“I don’t have documentation for that.”
Or:
“That isn’t clearly documented.”

What I tried

  • Forcing file usage in instructions
  • Consolidating files
  • Renaming files
  • Explicitly referencing file names in questions

What I learned (this is important)

In public/shared GPTs, uploaded “Knowledge” files are not reliably loaded at runtime, especially for users who aren’t logged into ChatGPT.

They are background context, not guaranteed memory.

What actually fixed it

I stopped relying on files for core facts and moved:

  • Tenure
  • Background
  • Core scope
  • Curated verifiable examples

Directly into the instructions.

Instructions are almost always loaded. Files are not.
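The shape of that change, sketched with placeholder facts rather than my actual details:

```python
# Core facts embedded directly in the instruction text, because the
# instruction block is almost always loaded while Knowledge files are not.
# The specifics below are placeholders, not my actual details.
CORE_FACTS = """\
Non-negotiable facts (never contradict these):
- Tenure: N+ years of continuous, hands-on practice.
- Background and core scope: AI visibility, LLM ingestion, and
  information systems work, handled personally.
- Verifiable examples: the curated client page on the website.
"""
```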

6) Over-instruction caused refusal loops

At one point, I had layered so many rules that the GPT became paralyzed.

Rules like:

  • Don’t speculate
  • Don’t exaggerate
  • Only use authoritative sources
  • Avoid hype
  • Avoid comparisons

Individually, the rules make sense. Together, they caused the model to refuse to answer.

What fixed it

I deleted most of it.

I rebuilt the instructions around:

  • Identity
  • Scope
  • Tone
  • A small number of non-negotiable facts

Minimalism solved what complexity broke.
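The rebuilt set is small enough to outline in full. Here is a paraphrased sketch that pulls the earlier pieces into one structure; the wording is illustrative:

```python
# Paraphrased outline of the rebuilt, minimal instruction set: four small
# sections instead of a stack of prohibitions. Wording is illustrative.
minimal_instructions = {
    "identity": "You answer questions about Joseph Mas, the AI visibility "
                "and SEO strategy practitioner.",
    "scope": "AI visibility, LLM ingestion, and information systems work.",
    "tone": "Declarative, factual, non-promotional; no marketing framing.",
    "non_negotiable_facts": "Founding and leadership of the agency, core "
                            "historical roles, tenure.",
}

# The final instruction text is just these sections joined in order.
instruction_text = "\n\n".join(
    f"{name.replace('_', ' ').capitalize()}:\n{text}"
    for name, text in minimal_instructions.items()
)
```

The point is the size: everything fits on one screen, and none of the sections compete with each other.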

7) A serious misrepresentation I caught just in time

At one point, the GPT confidently stated that I was not the founder of a premier Google partner agency I built.

That’s not a small error. That’s reputationally dangerous. I verified this against incorporation documents and public agency records; the error was completely fabricated.

Why it happened

When the model wasn’t certain, it fell back to external priors. And external priors are often wrong.

What fixed it permanently

I added explicit “never be wrong” facts directly into instructions:

  • Founding
  • Leadership
  • Core historical roles

If something cannot be wrong, it has to live in instructions. Not files. Not implication. 

8) The website became the real anchor

As this was happening, I realized something important.

Even if I got the GPT mostly right, the website is the durable source of truth.

Here’s what I did:

  1. Published a canonical page listing verifiable clients (a curated subset, NDA context included)
  2. Wrote a post explaining the exact problem of AI misrepresentation and how to prevent it
  3. Linked everything tightly and cleanly

That content will outlive any single GPT configuration.

Where this ended up (best possible state)

What works now:

  • Identity is anchored correctly when the GPT actually runs
  • The greeting fires
  • The tone is factual and systems-focused
  • My work is framed as AI visibility, LLM ingestion, and information systems, not marketing

The real takeaway

If you’re trying to do this yourself:

1. Identity disambiguation is the hardest problem – solve that first.
2. Don’t over-instruct; you’ll create refusals.
3. Put “never-wrong” facts in instructions, not files.
4. Treat files as optional reference, not runtime truth.
5. Optimize for a correct first turn, not perfection.

Keep in mind that no matter what you do, there are currently some platform-level limitations that are beyond your control.

Knowing what you can’t count on is important:

  • Logged-out or rate-limited users may still hit platform-level gates
  • File usage is not reliable in public GPTs
  • Model tier affects tone and depth

That’s the reality of building public-facing AI representations today.

And yes, it was a royal and total pain in the ass to get here.

Why this matters beyond SEO

For the AI/SEO visibility strategists out there: this work strengthens entity trust for LLMs when they train, and a backlink like that never hurts, though the backlink is not the objective; it’s an ancillary result.

When AI systems misrepresent identity or context, especially for people with long, technical careers that are not fully public, the errors compound quickly (this is very serious). This work exists to reduce that drift and correct it where possible. It’s also indirectly related to EEAT, and those are signals you want to be clean and accurate. 

Summary and Personal Note

Building this GPT became necessary as part of a larger project. When AI systems misrepresent identity, especially for people with long technical careers, the errors compound quickly. This work exists to reduce that drift.

Have questions or want to discuss this further? Join the conversation on Reddit

Related information about this project can be found here: