😱OpenAI's New GPTBot Scrapes Your Website - Unless You Opt Out

AI is getting scary good at video & voice

Your Weekly Dose of AI is Served

Overwhelmed by AI News & Tools?🥴

Relax, We Only Serve You Relevant Bites + Our AI & Automation Tutorials are a Piece of Cake! 🍰

In today’s Future AI Lab Menu:

  • 🍪 Freshly Baked AI Updates - OpenAI GPTBot Scrapes Your Website

  • 🍟 AI Snacks - Zoom’s Controversial Video Data Usage

  • 🤖 Baking With Bots (Tutorials) - Find out how AI tools use your data

  • 🛠️ 3 Taste Test Tools

Read time: 4 minutes

🍪 FRESHLY BAKED - AI UPDATES📰

OpenAI’s New GPTBot Scrapes Your Site

OpenAI, the company behind ChatGPT, has launched a bot that scrapes your website for content to train its AI unless you opt out. ChatGPT's training data currently only runs up to 2021.

Platforms like X (Twitter), website owners & news sites have been sharing ways to block the bot. (Tutorial below in “Baking With Bots”).

Why would you want to block it?

  • ChatGPT may not give credit or link back to your site

  • You’re not paid for your copyrighted content the way you would be if someone visited your site (Google ads, affiliate links, product sales etc.)

  • Data that you or your members create is no longer exclusive to your platform

Why would you want it to scrape your website?

  • Depending on how your content is used, it may be beneficial if your site link is referenced in ChatGPT answers

  • You have a branded product/personal brand. Any mention of your name would bring more awareness (Free marketing)

Further insights:

  • Google’s C4 and Common Crawl have previously scraped websites for use in training ChatGPT, Meta's LLMs & Stable Diffusion. Once scraped, the content is saved “forever”

  • One of our members Mr Abubakr pointed out that the “blocking of web scrapers” can still be bypassed via fake user agents. There’s no stopping other companies doing this in future so take this as a temporary solution.

Learn A.I in 5 minutes…

In other news, let’s be real for a second.

A.I is confusing.

That’s why we read Synthetic Mind.

It’s like the Future AI Lab, but mixed with business.

Written by 2 founders with over $40m in revenue, it offers a unique perspective instead of just boring news.

Synthetic Mind is giving away a free guide on how to make oodles & noodles of money and which tools are actually worth using.

Now Harder To Spot AI Videos

This new update from HeyGen’s founder has our minds blown! I honestly couldn’t tell which was AI generated. Before you view the video here, which one is AI?

Why it is better than previous versions:

  • Moves & looks more natural

  • Built-in voice clone (normally you have to pair video apps with a separate voice app like ElevenLabs)

New ChatGPT Updates

  1. 📝 Prompt examples - appear at bottom of chat to help you start

  2. 💌 Suggested replies - Continue the conversation with 1 click

  3. 💡 GPT-4 by default - Remembers which model you picked before

  4. 🗂️ Upload multiple files: Code Interpreter can now analyze data across multiple files

  5. ⌚️ Stay logged in - No longer logged out every 2 weeks

  6. ⌨️ Keyboard shortcuts - Try ⌘ (Ctrl) + / to see the complete list

🍿AI Snacks

🍭 Zoom revised terms - Controversy over using video meetings for AI training

🍩 Bing Chat coming to Safari and Chrome

🍰 AI is dangerously good at giving eating disorder advice

🍧 Buffett has a whopping 47% of his $375 billion portfolio in 3 AI stocks

🧋Microsoft Azure AI Text to Speech supports Multilingual Voices

🧑‍🍳BAKING WITH BOTS🤖

Block GPTBot From Crawling Your Site

To block GPTBot from accessing your site content, copy/paste the 2 lines below into your website's robots.txt file:

User-agent: GPTBot
Disallow: /
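
If you want to double-check what those 2 lines actually do before uploading them, here is a minimal sketch using Python's built-in `urllib.robotparser`. The helper name `is_allowed`, the example.com URL and the "SomeOtherBot" name are made up for illustration. It only shows how a crawler that follows robots.txt would read the rules; it can't prove how any particular bot behaves.

```python
from urllib.robotparser import RobotFileParser

# The same 2 lines from the tutorial above.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if these robots.txt rules let `user_agent` fetch `url`.

    `is_allowed` is a hypothetical helper written for this example.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/blog/post"  # placeholder URL
    print("GPTBot allowed:", is_allowed(ROBOTS_TXT, "GPTBot", page))
    # "SomeOtherBot" is an invented name standing in for any other crawler.
    print("SomeOtherBot allowed:", is_allowed(ROBOTS_TXT, "SomeOtherBot", page))
```

The rule only matches crawlers that announce themselves as GPTBot, so a bot using any other name is not covered by these 2 lines. That is the same limitation mentioned in the insights above.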

Prompt to Try

If you are worried about AI companies using your personal data in their AI training, here’s a quick way to check how they are using it.

Use the prompt below in Bard, Bing or Perplexity.ai (or any chatbot that allows URL links). (Not legal advice, always check with your lawyer to confirm.)

Act as a lawyer specializing in privacy and data protection law. 

Analyze the website at this URL: [insert URL here] 

From a legal perspective, focusing on user privacy and data practices, please respond in clear, simple, concise bullet points without technical jargon explaining:

1. What kinds of user data are collected by the site, and how is this data used? 
2. What else should users know about their privacy, data or ethical concerns?

🍴TASTE TEST TOOLS⚒️

  1. Google’s NotebookLM - Free virtual research assistant to help with your docs (Waitlist for US only)

  2. Tally - User-friendly platform for creating customizable forms and surveys with ease

  3. Desktopus AI - Simplifies presentation creation by offering pre-designed templates and automated content suggestions

🗞️ UNTIL NEXT BYTE!

🚀 How I can help you

1. Train ChatGPT to be your Brand Marketer! Grab it here

Generate an UNLIMITED supply of content ideas, craft your unique brand voice, create a marketing strategy and generate captivating video hooks/scripts tailored to your industry - all using my custom Plug and Play ChatGPT Prompts.

2. Advertise with us (13,000+ Email, 360k+ Social media) Book here!

Which Part Was Your Favourite Today?

Help us help you! Your feedback shapes future editions 😜

Disclaimer: The content provided herein is intended for entertainment and informational purposes only. It is not intended as, nor should it be construed as financial, legal, tax, investment, or other professional advice. Some of the links in this article may be affiliate links, which can provide compensation to me at no cost to you if you decide to purchase a paid plan. You should consult a professional advisor before making any decisions or taking any actions based on the information provided.