😱 OpenAI's New GPTBot Scrapes Your Website - Unless You Opt Out
AI is getting scary good at video & voice
Your Weekly Dose of AI is Served
Overwhelmed by AI News & Tools?🥴
Relax, We Only Serve You Relevant Bites + Our AI & Automation Tutorials are a Piece of Cake! 🍰
In today’s Future AI Lab Menu:
🍪 Freshly Baked AI Updates - OpenAI GPTBot Scrapes Your Website
🍟 AI Snacks - Zoom’s Controversial Video Data Usage
🤖 Baking With Bots (Tutorials) - Find out how AI tools use your data
🛠️ 3 Taste Test Tools
Read time: 4 minutes
🍪 FRESHLY BAKED - AI UPDATES📰
OpenAI’s New GPTBot Scrapes Your Site
OpenAI, the company behind ChatGPT, launched a bot called GPTBot that scrapes your website for content to train its AI unless you opt out. ChatGPT's training data currently only runs up to 2021.
Platforms like X (Twitter), website owners & news sites have been sharing ways to block the bot. (Tutorial below in “Baking With Bots”).
Why would you want to block it?
ChatGPT may not give credit or link back to your site
You’re not paid for your copyrighted content the way you would be if someone visited your site (Google ads, affiliate links, product sales etc.)
Data that you or your members create is not exclusive to your platform
Why would you want it to scrape your website?
Depending on how your content is used, it may be beneficial if your site link is referenced in ChatGPT answers
You have a branded product/personal brand. Any mention of your name would bring more awareness (Free marketing)
Further insights:
Web-scraped datasets such as Google’s C4 and Common Crawl have already been used to train ChatGPT, Meta’s LLMs & Stable Diffusion. Once your content is scraped, it is saved “forever”
One of our members, Mr Abubakr, pointed out that “blocking of web scrapers” can still be bypassed with fake user agents. There’s no stopping other companies from doing this in future, so treat this as a temporary solution.
Learn A.I in 5 minutes…
In other news, let’s be real for a second.
A.I is confusing. That’s why we read Synthetic Mind. It’s like the Future AI Lab, but mixed with business. It's written by 2 founders with over $40m in revenue, who give a unique perspective in their newsletters instead of just boring news.
Synthetic Mind is giving away a free guide on how to make oodles & noodles of money and which tools are actually worth using.
Now Harder To Spot AI Videos
This new update from HeyGen’s founder has our minds blown! I honestly couldn’t tell which was AI generated. Before you view the video here, which one is AI?
Why it is better than previous versions:
Moves & looks more natural
Built-in voice clone (normally you have to pair video apps with a separate voice app like ElevenLabs)
Join Waitlist to try
ChatGPT Updates
📝 Prompt examples - Appear at the bottom of the chat to help you start
💌 Suggested replies - Continue the conversation with 1 click
💡 GPT-4 by default - Remembers which model you picked before
🗂️ Upload multiple files: Code Interpreter can now analyze data across multiple files
⌚️ Stay logged in - No longer logged out every 2 weeks
⌨️ Keyboard shortcuts - Try ⌘ (Ctrl) + / to see the complete list
🍿 AI Snacks
🍭 Zoom revised terms - Controversy over using video meetings for AI training
🍩 Bing Chat coming to Safari and Chrome
🍰 AI is dangerously good at giving eating disorder advice
🍧 Buffett has a whopping 47% of his $375 billion portfolio in 3 AI stocks
🧋 Microsoft Azure AI Text to Speech supports Multilingual Voices
🧑‍🍳BAKING WITH BOTS🤖
Block GPTBot From Crawling Your Site
To block GPTBot from accessing your site content, copy/paste the 2 lines below into your website's robots.txt file:
User-agent: GPTBot
Disallow: /
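Want to taste-test the rule before serving it? Here is a minimal sketch using Python's built-in urllib.robotparser. It assumes your robots.txt contains only the 2 lines above; the helper name is_allowed, the page URL on example.com and the bot name "SomeOtherBot" are placeholders made up for illustration. It shows that the rule blocks a crawler that identifies itself as GPTBot, and nothing else.

```python
from urllib.robotparser import RobotFileParser

# The two robots.txt lines from the tutorial above.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Placeholder page URL, swap in a page from your own site.
page = "https://example.com/blog/my-post"

# A crawler that announces itself as GPTBot is told to stay out.
print("GPTBot allowed:", is_allowed(ROBOTS_TXT, "GPTBot", page))

# A crawler using any other name (hypothetical here) is not covered by the rule.
print("SomeOtherBot allowed:", is_allowed(ROBOTS_TXT, "SomeOtherBot", page))
```

Keep in mind that robots.txt is a polite request, not a lock: it only works for bots that identify themselves honestly and choose to respect it, which is why a fake user agent can slip past.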
Prompt to Try
If you are worried about AI companies using your personal data in their AI training, here’s a quick way to check how they are using it.
Use the prompt below in Bard, Bing or Perplexity.ai (or any chatbot that accepts URL links). (Not legal advice, always check with your lawyer to confirm.)
Act as a lawyer specializing in privacy and data protection law.
Analyze the website at this URL: [insert URL here]
From a legal perspective, focusing on user privacy and data practices, please respond in clear, simple, concise bullet points without technical jargon explaining:
1. What kinds of user data are collected by the site, and how is this data used?
2. What else should users know about their privacy, data or ethical concerns?
🍴TASTE TEST TOOLS⚒️
Google’s NotebookLM - Free virtual research assistant to help with your docs (Waitlist for US only)
Tally - user-friendly platform for creating customizable forms and surveys with ease
Desktopus AI - simplifies presentation creation by offering pre-designed templates and automated content suggestions.
🗞️ UNTIL NEXT BYTE!
🚀 How I can help you
1. Train ChatGPT to be your Brand Marketer! Grab it here
Generate an UNLIMITED supply of content ideas, craft your unique brand voice, create a marketing strategy and generate captivating video hooks/scripts tailored to your industry - All using my custom Plug and Play ChatGPT Prompts.
2. Advertise with us (13,000+ Email, 360k+ Social media) Book here!
3. Free A.I. & Automation Tutorials and List of 1000+ A.I. Tools
Which Part Was Your Favourite Today? Help us help you! Your feedback shapes future editions 😜