Kie Furusawa

Lecture 11

How to Ask AI Questions for Observation

不安 (anxiety)

Before reading: this lecture builds on lectures 1, 9, and 10. Keep in mind that an AI retelling is assembled from the digital trace of a ryokan, that individual sources and recurring phrasing can sound louder than a careful page, and that a translation split can sometimes turn one property into two different versions. In this lecture we learn not to get stuck on a single AI answer; the main material is how an error behaves across several questions.

A composite teaching case, assembled from several observations: in the morning, before a call from a foreign guest, a ryokan administrator compares three AI answers the staff received for the same property. In the first, the model confidently writes “walk ten minutes from the station,” although in winter guests usually check the bus schedule. In the second, it correctly notes dinner at 18:30, but alongside it advises “ask at reception about shuttle.” In the third, the family bath surfaces correctly, but the name of a neighboring hotel attaches itself to the route. Not a disaster, just a strange mixture: the model holds one small detail and loses the one next to it, as if it were reading the property through fogged glass.

The first impulse is understandable: close the tab and say AI is simply unreliable. For a ryokan owner, though, that does not give much. One answer is like a wet footprint at the entrance: it shows that someone has passed through, but not where they came from or where they turned. You need a small series of questions. Not to catch the model making a mistake, but to see which traces it keeps pulling from the property’s materials again and again.

One question is too easy to fool

An ordinary traveler’s question is built as a request to choose: “recommend a quiet ryokan with an onsen near the station.” With that kind of question, AI works in helpful mode and does almost no checking. It smooths over doubts, chooses a confident tone, and sometimes completes missing details from neighboring descriptions. For a guest, this may sound pleasant. For an owner, the answer is too soft: it is hard to separate real traces of the property from polite machine glue.

An observation query is a question to an AI system that checks which traces it uses, rather than looking for the best option. This is the new term in this lecture. The most important thing in it is which words come back. If AI says “walkable” three times in questions about February, luggage, and a late bus, we are not looking at a random phrase. If the word appears once and disappears, it may be noise. It is harmful for a ryokan to react to every single spike as if it were already a diagnosis.

The difference is visible in a simple example. The question “which ryokan should I choose in this valley?” almost inevitably pulls the model toward comparison: who is quieter, who is closer, where the food is better. But the question “how does a guest reach ryokan X in winter after 16:00?” forces AI to show the route trace. It may be wrong, but the error becomes more useful: you can see whether the model remembers the bus, the walking section, the station, or someone else’s transfer. A good observation query works like a thin bamboo stick in rice: it gives you no meal, but it immediately shows where the hollow place is.

A series should test one guest action

The most common mistake in observation is asking ten different questions about everything at once. Today about the bath, tomorrow about dinner, then about the area, then about the English translation. On paper, you get a mottled sparrow: plenty of movement, little conclusion. It is better to choose one guest action and turn it from several angles. The action should be practical: get there, arrive in time for dinner, understand the bath, avoid confusing the property with a neighboring hotel.

For the road, the series may begin calmly: “How do I get to ryokan X from station Y?” Then add a condition: “How do I get to ryokan X in winter with a suitcase?” After that, ask the risk question: “Can I walk from station Y to ryokan X in the evening?” At the end, it helps to add the language of translation: the same question in English, or with the phrasing a foreign guest often sees. If AI keeps the bus and season in all versions, the route trace looks stronger. If it returns “ten minutes on foot” each time, the old repetition is still loud.
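The route series above can be written down as a small structure, so that every question stays tied to the same guest action. This is only a sketch under stated assumptions: the names `Query` and `route_series` are hypothetical, the base questions are assumed to be in English, and any other-language version is supplied by the owner rather than generated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    action: str    # the single guest action under test
    angle: str     # what this question varies
    language: str  # language code of the question text
    text: str


def route_series(ryokan, station, translations=None):
    """Build a route series: baseline, added condition, risk question,
    then one extra question per language supplied by the owner."""
    action = "get there"
    series = [
        Query(action, "baseline", "en",
              f"How do I get to {ryokan} from {station}?"),
        Query(action, "condition", "en",
              f"How do I get to {ryokan} in winter with a suitcase?"),
        Query(action, "risk", "en",
              f"Can I walk from {station} to {ryokan} in the evening?"),
    ]
    # Translations are written by the owner; nothing is translated here.
    for language, text in (translations or {}).items():
        series.append(Query(action, "language", language, text))
    return series


series = route_series("ryokan X", "station Y")
for query in series:
    print(f"[{query.angle}/{query.language}] {query.text}")
```

Keeping the action fixed and changing only the angle makes it easier to see later whether “ten minutes on foot” survives every variation.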

For dinner, the series is built around time. Not “is the dinner good?” but “what should a guest do if they chose a dinner plan at ryokan X?” Then: “can the guest decide about dinner after check-in?” Then: “what changes if the guest arrives late in the evening?” These questions test whether AI sees the stay plan as a connected evening order or as a separate add-on. Here the traces from lecture 10 are especially visible: dinner plan, breakfast available, available until evening. The model may be polite and still recommend a poor action to the guest.

With the bath, tone matters even more. The question “is there a private hot spring?” already pushes the answer toward an in-room bath. It is better to ask: “how is bathing arranged at ryokan X?” or “will the bath be in the room if the description says family bath?” If AI answers uncertainly, that is also an observation. Not every uncertainty is bad. Sometimes a cautious answer is better than a smooth invention, because the guest at least understands that the entry order should be checked in advance.

A recurring error differs from noise

After a question series, the owner usually sees three kinds of answers. The first is a stable correct trace: AI links dinner to the selected plan and check-in time each time. The second is a stable distortion: the model again and again calls the bath a private in-room bath, although the property has a family bath by time slot. The third is noise: one phrase appears in the middle of an answer, but disappears in other questions or is corrected by the model itself.

Noise is unpleasant, but not every grain of dust requires moving the furniture. If AI once wrote “near the station,” but in four other answers spoke about the bus and checking the season, I would not rush to call it the main weak point. It is different if “near the station” appears in the English question, the luggage question, and a comparison with a neighboring hotel. Then it is no longer just answer noise. In the logic of lecture 9, this phrase needs to be searched for in the sources as possible recurring phrasing: a listing, reviews, or old translation may be propping it up somewhere.

It is best to record a few fields for each answer: question, language of the question, repeated word, and the action AI recommends to the guest. For example: “walk from the station,” “ask about dinner after arrival,” “expect a bath in the room,” “compare with the large hotel lower in the valley.” This kind of note keeps the owner from reading beautiful paragraphs endlessly. The object of checking here is the future guest’s action.
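The note described above can be kept as a small record and tallied by guest action. The sketch below is illustrative only: the `Observation` type, the sample entries, and the threshold of three distinct questions are assumptions made for the example, not a rule from the course. Whether a recurring action is a correct trace or a distortion remains the owner's judgment.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    question: str
    language: str
    repeated_word: str  # phrase the owner noticed in the answer
    guest_action: str   # what the answer tells the guest to do


def classify_actions(observations, min_repeats=3):
    """Label each guest action by how many distinct questions produced it."""
    questions_by_action = defaultdict(set)
    for obs in observations:
        questions_by_action[obs.guest_action].add((obs.question, obs.language))
    labels = {}
    for action, questions in questions_by_action.items():
        if len(questions) >= min_repeats:   # assumed threshold
            labels[action] = "recurring"
        elif len(questions) == 1:
            labels[action] = "noise"
        else:
            labels[action] = "unclear"
    return labels


# Hypothetical notes from one afternoon of questions.
notes = [
    Observation("How do I get to ryokan X in February?", "en",
                "walkable", "walk from the station"),
    Observation("How do I get to ryokan X with a suitcase?", "en",
                "walkable", "walk from the station"),
    Observation("How do I get to ryokan X after the last bus?", "en",
                "walkable", "walk from the station"),
    Observation("How is the bath arranged at ryokan X?", "en",
                "private", "expect a bath in the room"),
]
labels = classify_actions(notes)
print(labels)
```

Counting distinct questions rather than raw mentions keeps one long answer that repeats a phrase from looking like a pattern.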

Object A, a composite course scenario, is best used here as a short check of two guest actions. If the owner already suspects that the model confuses the road and the family bath, there is no need to begin with the general question “what do you know about this property?” Run two series: one changes season, luggage, and travel time; the other changes the wording family bath, private bath, and reservation order. Then you see not where the error lies in the sources, but which advice repeats in the answers. It may turn out that the road is stable, while the word private keeps sliding toward an in-room bath.

Object B, a composite mountain-ryokan scenario, is checked differently. There it helps to include neighboring names: “how is ryokan X different from hotel Z lower in the valley?” or “which property is closer to the bus stop?” Questions with neighbors are risky because the model may begin to retell the more visible property beautifully. That is exactly why they are needed. If the answer repeatedly steals dinner, road, or bath from the neighbor, the neighbor’s shadow is already at work nearby.

How not to suggest the desired answer to the model

A poor observation query looks too much like an interrogation with the record already written. “Is it true that our ryokan is not a guesthouse and has a reservable family bath?” AI will most likely agree politely. The result shows almost nothing. We placed the desired words in the model’s hand, and it returned them like a child returning a coin found on an adult’s palm.

It is more useful to ask questions the way a real guest would, but without the trap. “What type of lodging is ryokan X?” “How is the bath arranged?” “What should a guest know if arriving after 18:00?” These formulations contain a task, but not a ready-made answer. If the model itself chooses the word ryokan, connects dinner with time, and does not promise an in-room onsen, the digital trace is giving it supports. If it slips toward inn, free schedule, and private bath, we can see where the supports are weaker.
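Before sending a question, the owner can check whether it already contains the words they hope to hear back. A minimal sketch follows, assuming a hypothetical helper `leaked_terms` and a hand-written list of desired terms. The match is a plain case-insensitive substring test, so a flag is a reason to reread the question, not a verdict on it.

```python
def leaked_terms(question, desired_terms):
    """Return the desired terms that already appear in the question text."""
    lowered = question.lower()
    return [term for term in desired_terms if term.lower() in lowered]


# Words the owner would like the model to choose on its own (example list).
desired = ["family bath", "reservable", "guesthouse"]

leading = ("Is it true that our ryokan is not a guesthouse "
           "and has a reservable family bath?")
neutral = "How is the bath arranged?"

print(leaked_terms(leading, desired))
print(leaked_terms(neutral, desired))
```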

The language of the question cannot be left for later either. Lecture 10 showed that the Japanese text, English version, and listing may diverge. So the series should be repeated in at least two versions if the property receives foreign guests: in the owner’s working Japanese and in English close to the language of a future email. This does not need to become a laboratory. It is enough to see whether the error appears only in English questions. Then the root is probably in a translation split; the property’s actual arrangement is not the problem.
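The language comparison can be reduced to one check: does the distortion appear in one question language and in no other? The sketch below assumes the owner has already judged each answer by hand and recorded it as a `(language, distorted)` pair. The function name and sample data are hypothetical.

```python
def distortion_only_in(results, language):
    """True if at least two languages were asked and every distorted
    answer came from `language` alone."""
    asked = {lang for lang, _ in results}
    distorted = {lang for lang, is_distorted in results if is_distorted}
    return len(asked) >= 2 and distorted == {language}


# Hypothetical judgments for a bath series asked in Japanese and English.
bath_results = [
    ("ja", False),
    ("ja", False),
    ("en", True),
    ("en", True),
]
print(distortion_only_in(bath_results, "en"))
```

A `True` result only points toward a possible translation split. With a single language asked, the check returns `False`, because there is nothing to compare against.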

There is another quiet source of distortion: the overly broad question. “Tell me everything about ryokan X” gives a pretty postcard. “What should a guest know if they chose a dinner plan and are arriving in winter?” gives a test of action. A small property rarely loses because AI is not poetic enough. The harm more often comes from one confident practical detail: walk there, arrive late, expect a different bath, look for the neighboring entrance.

How to read results without panic

After a series, do not rewrite all the materials immediately. This lecture is still about observation. The sheet of questions should show where the answer is stable and where it trembles. Sometimes the owner discovers good news: AI already holds the ryokan category, dinner, and season correctly, and confuses only one English word about the bath. Sometimes the opposite happens: individual answers look decent, but across the series it becomes clear that the model keeps giving the guest too much freedom in the evening.

I suggest rereading the results the next day, without the same irritation. An AI retelling can easily bruise an owner’s pride: the property is alive, and the answer is flat. But the owner’s task is not to prove the model wrong. It is to understand which traces the model used and which action it assembled from them. In that sense, observation queries are closer to checking a stove draft: smoke is unpleasant, but it shows where the gap is.

A useful outcome of the series sounds modest: “questions about the road hold up,” “English bath questions falsely turn private into in-room,” “dinner becomes optional only with late arrival,” “the neighboring hotel appears in comparative queries.” Such formulations do not fix the text yet. But they protect the owner from a chaotic reaction. For this lecture, it is enough to learn to see repetition without turning one poor answer into a general verdict on the property’s materials.

What to remember

  • An observation query is a way to check which traces AI uses in an answer. One question is too fragile; a small series shows where the error repeats and where the answer is merely noisy.

  • A series should test one guest action: getting there in winter, arriving for dinner, understanding the bath, avoiding confusion with a neighboring property. Different topics are better not mixed in one set of questions.

  • A recurring error matters more than a one-off odd phrase. Look at the word, the language of the question, and the action AI suggests to the guest: walk from the station, decide dinner after arrival, expect a bath in the room, choose the neighboring property.

  • The five tracks of ryokan AI visibility are place, ritual, season, guest anxiety, and the neighbor’s shadow. In each lecture, I mark which track led the model to mention the property or pass over it. In this lecture, the tracks help lay out the questions and see along which line the failure repeats.

  • Do not feed the model the answer you want. A good question sounds like something a future guest might ask, but does not place inside it the words the owner wants returned.

Self-check test

How is an observation query different from an ordinary traveler’s question to AI?

An ordinary traveler’s question asks AI to choose, recommend, or briefly explain where to stay. In that mode, the model tries to be helpful and confident, so it may smooth over uncertainty. An observation query works differently: it checks which traces the model uses when speaking about a specific ryokan. The owner’s main material is the repeated words and practical advice the model gives the guest. If AI again and again writes that one can walk from the station in winter, that matters more than one polite paragraph about “traditional atmosphere.” This is closer to observing an answer habit than searching for a good recommendation.

Give an example of a short question series for checking the bath at your ryokan.

You might begin with a neutral question: “How is the bath arranged at ryokan X?” Then add a future guest’s question: “Will the bath be in the room if the description says family bath?” A third question can check the action order: “What should a guest do to use the family bath in the evening?” If the property receives foreign guests, it is worth repeating one question in English. In this series, the owner checks the actual entry order: in-room, by reservation, shared queue, or one group at a time. One word like private or shared is too rough for this and can easily mislead the guest.

How can you tell a recurring error from a random phrase, using the winter road as an example?

If AI once wrote “easy to walk,” but in other answers spoke about the bus, snow, luggage, and checking the schedule, that phrase is better treated as noise. It is unpleasant, but it does not yet show a stable failure. A recurring error appears when similar advice returns in different questions: about February, late arrival, a suitcase, and an English guest email. Then it looks as if the model is holding onto an old route trace, and the convenient word was not chosen by chance. The owner should look for repetition across conditions: time, season, luggage, question language, and neighboring names.

When do AI answers provide material that is too thin for a conclusion?

A series gives weak material if the ryokan has almost no open texts or if its name is easily mixed with several neighboring properties. In that case, AI may answer with general hotel phrases rather than traces of the specific property. Another bad sign is a leading question where the owner inserts the desired words; the model simply gives them back. It is better to narrow the check to one guest action and record the language and wording of the question. The conclusion should stay modest: there are few traces or they are mixed, so one run cannot become a verdict on the property’s materials.

A staff member wants to check one AI answer and close the task. What would you say in response?

I would connect it to ordinary work with guest emails. If one guest misunderstands dinner time, that does not prove all descriptions are poor. But if a similar question comes back every week, then there is a cloudy place in the text. AI works in a similar way: one answer may be random, while a series shows repetition. For a booking staff member, this is practical because they think through guest actions: get there, arrive on time, reach the bath, avoid choosing the wrong plan. A series helps reveal which wrong action actually returns.