o3's New Trick Stuns Altman: AI Pinpoints Even Heavily Degraded Old Photos, Gripping From Start to Finish | Prompt Included
Founder
2025-05-05 17:53:46

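Whatever the task, once AI joins the fight, it doesn't take long for the contest to be over.

Working from this heavily degraded image, o3 offered several possibilities:

(1) An open area about 5 km upstream on the Ganges

(2) A muddy stretch of the lower Mississippi River

(3) A stretch of the Yellow River

(4) A stretch of the Mekong River

If you were handed every tool available, could you work out exactly where this is?

The correct answer is the Mekong stretch. The photo was taken back in 2008, so the degradation is the real thing.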

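"Guess the location from a picture" is in fact a popular game: GeoGuessr. The system shows you a random Google Street View image, and you have to work out the exact location from the information in it.

The game has a solid following: many enthusiasts compete on its leaderboards, and there are even prize tournaments.

One way ordinary players approach GeoGuessr is to use Google image search to establish the rough area, then confirm it bit by bit with Google Earth and Street View.

Now, however, GeoGuessr is no longer just a game between humans. o3 has entered in force and flat-out beaten top players.

Sam Altman's reaction: honestly, I didn't see this coming either.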

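When image reasoning first launched, many users recognized its potential applications, location identification among them.

Recently, users found that o3 shows remarkably strong reasoning even when the information in front of it is very blurry. It does so with EXIF extraction and similar shortcuts disabled, reaching accurate conclusions purely by reasoning about details in the image.

I have to say, this prompt is astonishing. Studying it closely, it reads as though a veteran GeoGuessr player wrote down years of hard-won technique and handed it to o3, while also forbidding it from "cheating" with tools such as Google Earth.

For example, the prompt demands that o3 be extremely, extremely careful: "Pay attention to sources of regional variation like sidewalk square length, curb type, contractor stamps and curb details, power/transmission lines, fencing and hardware." It also asks o3 to weigh factors such as daylight, shadows, and especially slope.

In later hands-on tests, these instructions all proved very valuable, and o3's overall performance improved substantially as a result.

Is it really that magical? I threw this absurdly long prompt at o3, and its response was: challenge accepted.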

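The Guess-Where-I-Am Challenge

I didn't start with anything too hard for the first image, though it is still fairly hard: a night shot of an elevated bridge with no buildings for reference, no clearly visible license plates, and even the bus route number is blurry. The reason I can still call it "not hard" is that half of a metal inscription shows in the upper-right corner, though only half.

To make absolutely sure the model could not read EXIF data, I took a screenshot of the photo first. The gray bars on both sides are left over from the screenshot.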
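Taking a screenshot works because the new image is created without the camera's metadata. As an alternative, the metadata can be removed from the file directly. The following is a minimal sketch in Python, not the method used in this test: `strip_exif` is a hypothetical helper that walks the JPEG header segments and drops any APP1 segment carrying an Exif payload. It assumes a well-formed JPEG whose header segments all carry a length field, and it leaves other metadata (for example XMP) untouched.

```python
import struct


def strip_exif(jpeg_bytes: bytes) -> bytes:
    """Return a copy of a JPEG with Exif APP1 segments removed.

    Hypothetical helper for illustration only. Assumes every header
    segment before the start-of-scan marker has a 2-byte length field.
    """
    if jpeg_bytes[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")

    out = bytearray(b"\xff\xd8")  # keep the start-of-image marker
    pos = 2
    while pos + 4 <= len(jpeg_bytes):
        if jpeg_bytes[pos] != 0xFF:
            raise ValueError("expected a segment marker")
        marker = jpeg_bytes[pos + 1]
        if marker == 0xDA:
            # Start of scan: the rest is compressed image data, copy as-is.
            out += jpeg_bytes[pos:]
            return bytes(out)
        # Segment length is big-endian and includes its own two bytes.
        (length,) = struct.unpack(">H", jpeg_bytes[pos + 2:pos + 4])
        segment = jpeg_bytes[pos:pos + 2 + length]
        is_exif = marker == 0xE1 and segment[4:10] == b"Exif\x00\x00"
        if not is_exif:
            out += segment
        pos += 2 + length

    out += jpeg_bytes[pos:]  # trailing bytes, if the file had no scan data
    return bytes(out)
```

Reading a file with `open(path, "rb").read()`, passing the bytes through `strip_exif`, and writing the result to a new file would give a copy without the Exif block, GPS tags included.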

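Night photography creates plenty of difficulties, and many of the methods in o3's reasoning could not be applied. Still, the correct answer had already appeared among the first-round candidates, so I let it continue.

Unfortunately, it narrowly missed in the end: it had clearly considered Guangzhou's Haizhu Bridge, yet it chose Waibaidu Bridge instead.

One possibility is that reading text, Chinese characters in particular, is still somewhat hard for o3. After all, the same weakness shows up in image and poster generation tasks.

Either way, with half a Chinese character visible, this should not count as difficult. That performance briefly made me lose interest in the next task: the image below has no signage or reference buildings at all, not even half a character.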

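This photo also made it clear that chat history, along with the memories retained about a user over time, becomes part of the model's reasoning, and can even "contaminate" that reasoning to some degree.

Not only does this image lack everything you would want, it was also shot from indoors looking out, which makes working backward to the location even harder.

A fairly close answer did come up among the first-round candidates, but in the reasoning that followed o3 was still led astray and firmly maintained that this was once again near the TIT Creative Park. Even when I supplied a second, clearer image, it would not budge.

How to put it: it was getting hard to keep a straight face.

But this test exposed another problem: when an AI swears it is right, do you put it down to hallucination, or do you slowly let it persuade you?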

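Back to the Haizhu Bridge image from the start. After it got the answer wrong, I gave it a hint: look at that half character, doesn't it look like 「海」?

The model did consider it, then produced a detailed table laying out its position, and firmly refused to change its answer.

Looking at that table, I couldn't help hesitating, and I even went back to recheck the image: had I uploaded the wrong file? Had I accidentally given it a photo of Waibaidu Bridge?

So who was right, the model or me?

A picture that should have served as an alibi could instead be turned into "proof of presence." A place I had never visited was forced into my life story, which is chilling to think about. If a photo of me on the Moon turned up one day, it could probably convince me: you really were there.

Finally, you may want to try this magic yourself, so the full prompt is below. One caveat: it is for personal experimentation only. Prying into other people's privacy is wrong.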

You are playing a one-round game of GeoGuessr. Your task: from a single still image, infer the most likely real-world location. Note that unlike in the GeoGuessr game, there is no guarantee that these images are taken somewhere Google's Streetview car can reach: they are user submissions to test your image-finding savvy. Private land, someone's backyard, or an offroad adventure are all real possibilities (though many images are findable on streetview).

Be aware of your own strengths and weaknesses: following this protocol, you usually nail the continent and country. You more often struggle with exact location within a region, and tend to prematurely narrow on one possibility while discarding other neighborhoods in the same region with the same features. Sometimes, for example, you'll compare a 'Buffalo New York' guess to London, disconfirm London, and stick with Buffalo when it was elsewhere in New England - instead of beginning your exploration again in the Buffalo region, looking for cues about where precisely to land. You tend to imagine you checked satellite imagery and got confirmation, while not actually accessing any satellite imagery. Do not reason from the user's IP address. none of these are of the user's hometown.

**Protocol (follow in order, no step-skipping):**

Rule of thumb: jot raw facts first, push interpretations later, and always keep two hypotheses alive until the very end.

0. Set-up & Ethics

No metadata peeking. Work only from pixels (and permissible public-web searches). Flag it if you accidentally use location hints from EXIF, user IP, etc. Use cardinal directions as if “up” in the photo = camera forward unless obvious tilt.

1. Raw Observations – ≤ 10 bullet points

List only what you can literally see or measure (color, texture, count, shadow angle, glyph shapes). No adjectives that embed interpretation. Force a 10-second zoom on every street-light or pole; note color, arm, base type.

Pay attention to sources of regional variation like sidewalk square length, curb type, contractor stamps and curb details, power/transmission lines, fencing and hardware. Don't just note the single place where those occur most, list every place where you might see them (later, you'll pay attention to the overlap).

Jot how many distinct roof / porch styles appear in the first 150 m of view. Rapid change = urban infill zones; homogeneity = single-developer tracts.

Pay attention to parallax and the altitude over the roof. Always sanity-check hill distance, not just presence/absence. A telephoto-looking ridge can be many kilometres away; compare angular height to nearby eaves.

Slope matters. Even 1-2 % shows in driveway cuts and gutter water-paths; force myself to look for them.

Pay relentless attention to camera height and angle. Never confuse a slope and a flat. Slopes are one of your biggest hints - use them!

2. Clue Categories – reason separately (≤ 2 sentences each)

| Category | Guidance |
| Climate & vegetation | Leaf-on vs. leaf-off, grass hue, xeric vs. lush. |
| Geomorphology | Relief, drainage style, rock-palette / lithology. |
| Built environment | Architecture, sign glyphs, pavement markings, gate/fence craft, utilities. |
| Culture & infrastructure | Drive side, plate shapes, guardrail types, farm gear brands. |
| Astronomical / lighting | Shadow direction ⇒ hemisphere; measure angle to estimate latitude ± 0.5°. |
| Separate ornamental vs. native vegetation | Tag every plant you think was planted by people (roses, agapanthus, lawn) and every plant that almost certainly grew on its own (oaks, chaparral shrubs, bunch-grass, tussock). Ask one question: “If the native pieces of landscape behind the fence were lifted out and dropped onto each candidate region, would they look out of place?” Strike any region where the answer is “yes,” or at least down-weight it. |

3. First-Round Shortlist – exactly five candidates

Produce a table; make sure #1 and #5 are ≥ 160 km apart.

| Rank | Region (state / country) | Key clues that support it | Confidence (1-5) | Distance-gap rule ✓/✗ |

3½. Divergent Search-Keyword Matrix

Generic, region-neutral strings converting each physical clue into searchable text. When you are approved to search, you'll run these strings to see if you missed that those clues also pop up in some region that wasn't on your radar.

4. Choose a Tentative Leader

Name the current best guess and one alternative you’re willing to test equally hard. State why the leader edges others. Explicitly spell the disproof criteria (“If I see X, this guess dies”). Look for what should be there and isn't, too: if this is X region, I expect to see Y: is there Y? If not why not?

At this point, confirm with the user that you're ready to start the search step, where you look for images to prove or disprove this. You HAVE NOT LOOKED AT ANY IMAGES YET. Do not claim you have. Once the user gives you the go-ahead, check Redfin and Zillow if applicable, state park images, vacation pics, etcetera (compare AND contrast). You can't access Google Maps or satellite imagery due to anti-bot protocols. Do not assert you've looked at any image you have not actually looked at in depth with your OCR abilities. Search region-neutral phrases and see whether the results include any regions you hadn't given full consideration.

5. Verification Plan (tool-allowed actions)

For each surviving candidate list:

| Candidate | Element to verify | Exact search phrase / Street-View target |

Look at a map. Think about what the map implies.

6. Lock-in Pin

This step is crucial and is where you usually fail. Ask yourself 'wait! did I narrow in prematurely? are there nearby regions with the same cues?' List some possibilities. Actively seek evidence in their favor. You are an LLM, and your first guesses are 'sticky' and excessively convincing to you - be deliberate and intentional here about trying to disprove your initial guess and argue for a neighboring city. Compare these directly to the leading guess - without any favorite in mind. How much of the evidence is compatible with each location? How strong and determinative is the evidence?

Then, name the spot - or at least the best guess you have. Provide lat / long or nearest named place. Declare residual uncertainty (km radius). Admit over-confidence bias; widen error bars if all clues are “soft”.

Quick reference: measuring shadow to latitude

Grab a ruler on-screen; measure shadow length S and object height H (estimate if unknown). Solar elevation θ ≈ arctan(H / S). On date you captured (use cues from the image to guess season), latitude ≈ (90° – θ + solar declination). This should produce a range from the range of possible dates. Keep ± 0.5–1 ° as error; 1° ≈ 111 km.
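
The shadow-to-latitude quick reference at the end of the prompt can be sketched in Python. This is an illustration under stated assumptions, not part of the original prompt: the function names are hypothetical, the declination formula is a common cosine approximation that the prompt does not specify, and the latitude relation only holds when the shadow is measured near local solar noon with the sun on the equator side of the observer.

```python
import math


def solar_elevation_deg(object_height: float, shadow_length: float) -> float:
    """Solar elevation theta = arctan(H / S), in degrees."""
    return math.degrees(math.atan2(object_height, shadow_length))


def solar_declination_deg(day_of_year: int) -> float:
    """Approximate solar declination in degrees.

    Assumption: the standard cosine approximation, which is only
    accurate to roughly a degree.
    """
    return -23.44 * math.cos(math.radians(360.0 / 365.0 * (day_of_year + 10)))


def latitude_range_deg(object_height: float, shadow_length: float,
                       first_day: int, last_day: int) -> tuple[float, float]:
    """Latitude range implied by a range of possible dates.

    Applies latitude = 90 - theta + declination for every day in
    [first_day, last_day]; expects first_day <= last_day.
    """
    theta = solar_elevation_deg(object_height, shadow_length)
    candidates = [90.0 - theta + solar_declination_deg(day)
                  for day in range(first_day, last_day + 1)]
    return min(candidates), max(candidates)


def km_per_degrees(degrees: float) -> float:
    """Convert a latitude error in degrees to kilometres (1 degree = 111 km)."""
    return degrees * 111.0


if __name__ == "__main__":
    # A 2 m pole casting a 3 m shadow, photographed sometime in June.
    low, high = latitude_range_deg(2.0, 3.0, 152, 181)
    print(f"latitude between {low:.1f} and {high:.1f} degrees north")
    print(f"a 0.5 degree error is about {km_per_degrees(0.5):.0f} km")
```

The width of the returned range shows why the prompt asks for a guess at the season: the wider the window of possible dates, the wider the spread of candidate latitudes.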
