Jump to content

Wikipedia:Request a query

From Wikipedia, the free encyclopedia
(Redirected from Wikipedia:SQL requests)

This is a page for requesting one-off database queries for certain criteria. Users who are interested and able to perform SQL queries on the projects can provide results from the Quarry website.

You may also be interested in the following:

  • If you are interested in writing SQL queries or helping out here, visit our tips page.
  • If you need to obtain a list of article titles that meet certain criteria, consider using PetScan (user manual) or the default search. Petscan can generate list of articles in subcategories, articles which transclude some template, etc.
  • If you need to make changes to a number of articles based on a particular query, you can post to the bot requests page, depending on how many changes are needed.
  • For long-term review and checking, database reports are available.

Quarry does not have access to page content, so queries which require checking wikitext cannot be answered with Quarry. However, someone may be able to assist by using Quarry in another way (e.g. checking the table of category links rather than the "Category:" text) or suggest an alternative tool.

Find duplicated file licensing templates

[edit]

Hello, I'd like a query for every file that use any of these templates twice (the same template twice, not one usage of one template and one usage of another). For example File:Madonna Frozen Sickick.png should be one of the results. Thanks! Jonteemil (talk) 13:47, 27 July 2024 (UTC)[reply]

Not possible with a query: the table showing transclusions doesn't have duplicates (it's the same one used to generate Special:Whatlinkshere), and the replicas don't have the page text. Best I can think of is using search one template at a time, like so. —Cryptic 18:58, 27 July 2024 (UTC)[reply]
Okay, thank you. Jonteemil (talk) 15:04, 7 August 2024 (UTC)[reply]

Composition categories without templates

[edit]

Hi all, if possible, I'd like to request a list of any subcategories in Category:Compositions by composer (which have more than 2 entries) in which there is no corresponding navigational box created for said composer's works.

So essentially, I'd be looking for categories like Category:Compositions by Example Person which have more than 2 pages, but there is no corresponding {{Example Person}} that exists. I'd then go through the list and create the missing templates myself.

Thanks – Aza24 (talk) 22:35, 28 July 2024 (UTC)[reply]

quarry:query/85143. —Cryptic 23:24, 28 July 2024 (UTC)[reply]
At least one false negative that I noticed by accident: Category:Compositions by Dmitry Bortniansky isn't on the list despite Template:Dmitry Bortniansky not existing, since it only has one member, Category:Operas by Dmitry Bortniansky, even though that category has two members. Trivial to rerun the query without the minimum-number-of-members constraint, while still showing a count of members (and a list of the members' namespaces) if you want. —Cryptic 23:31, 28 July 2024 (UTC)[reply]
Thank you so much @Cryptic, that should work well. Question: will this link stick around or should I copy the results somewhere for my own use? Aza24 (talk) 00:24, 29 July 2024 (UTC)[reply]
It'll stay in place (and the results won't change even when the underlying data here does). It would probably be convenient to copy it into a sandbox page, though, so you can click on the links directly. The cyan "Download data" dropdown has an option for wikitable format. —Cryptic 00:29, 29 July 2024 (UTC)[reply]
Yes, good point! Thanks again. Aza24 (talk) 04:28, 29 July 2024 (UTC)[reply]

Heavy uses of Template:IETF RFC

[edit]

I'm interesting in finding the articles that make heavy use of Template:IETF RFC. Would it be possible to rank the top-10 articles with the most calls ("transclusions"?) of that template, please? Thanks! fgnievinski (talk) 01:20, 15 August 2024 (UTC)[reply]

Not here, for the same reason as #Find duplicated file licensing templates above. —Cryptic 01:30, 15 August 2024 (UTC)[reply]
It might be feasible to do it manually, though - there's only 817 articles that transclude it. You could scrape the wikitext for all of them, parse them for transclusions, and count those. There's only a couple redirects to confuse things, though if there's any indirect transclusions (i.e., through a different template), that'd make things harder. —Cryptic 01:35, 15 August 2024 (UTC)[reply]
Understood, thank you. I've asked around:
fgnievinski (talk) 02:35, 15 August 2024 (UTC)[reply]
Both of those projects are almost certainly pulling their data from the same place, hence with the same limitations. —Cryptic 03:01, 15 August 2024 (UTC)[reply]
Thanks for the heads-up. I asked because I'm not fluent in Wiki scrapping. fgnievinski (talk) 03:22, 15 August 2024 (UTC)[reply]
As above, search can sort of do this. this looks for "{{", zero or more spaces, "ietf", any single character, then "rfc", case insensitive, with all of that occurring at least forty times in the same mainspace page (both redirects to {{IETF RFC}} with any transclusions at all match that pattern). It finds 13 articles. It could very well be missing some, though; for example, {{IETF RFC|1234}} with extra spaces in the middle would render normally but not be searchable by this. —Cryptic 03:34, 15 August 2024 (UTC)[reply]
Wow wow, that's awesome, many thanks!!! fgnievinski (talk) 04:17, 15 August 2024 (UTC)[reply]

AfD and article deletion statistics

[edit]

Hello lovely volunteers. I'm trying to calculate 1) The number of articles deleted in 2022, and 2) The percentage of article deletions in 2022 that were the result of an AfD discussion. I think existing queries do this but would appreciate it if someone could check my interpretation of the queries.

Here are my assumptions:

  1. My definition of an "article" is a mainspace page that is not a redirect. I don't mind including dab pages, pages with no links, etc.
  2. I assume that all articles deleted as a result of an AfD discussion have "Articles for deletion" in the log reason
  3. From quarry:query/78694 I see that in 2022, 109,583 mainspace pages were deleted in 2022
  4. From quarry:query/78460 I see that in 2022, 37,297 pages were deleted with "redirect overwrite" in the log reason
  5. From quarry:query/78460 I see that in 2022, 5380 pages were deleted with "Redirects for discussion" in the log reason
  6. From quarry:query/78460 I see that in 2022, 13,635 pages were deleted with "Articles for Deletion" in the log reason
  7. I assume that the number of mainspace redirects not accounted for in the above queries is negligible or too difficult to get (is this true?)

Therefore it appears that:

  • In 2022, 66,906 articles were deleted (109,583 - 37,297 - 5380), an average of 183 per day
  • In 2022, 13,635 articles were deleted via AfD, an average of 37 per day
  • In 2022, 20% of article deletions were the result of an AfD discussion.

Does this look right? Clayoquot (talk | contribs) 19:38, 26 August 2024 (UTC)[reply]

#3 and #4 are accurate; #5-7 aren't. And #3 isn't either if you'd count a move out of mainspace without leaving a redirect as deletion.
Query 78460 doesn't do quite what you think it does. It's not showing the total number of deletions whose logs mention A7, the total number that mention AFD, etc; it tries to assign each deletion to a single reason in a given order. So if, say, a page was deleted with comment "Wikipedia:Articles for deletion/Spacely Sprockets; also a WP:CSD#G11" it would be counted as G11 and not AFD.
And there's no way to find out if a page was a redirect or not when it was deleted, other than if the comment mentions an R-series criterion (which that query doesn't look for) or RFD, or if it's the automatic deletion during a page-move. You could conceivably look at the length of the most recent deleted revision, but there's lots of redirects with more bytes in them than lots of short articles. I'd have no confidence in any query of the public replicas that purported to accurately count the number of redirects deleted. But if you were trying for a SWAG, you can still do a lot better than this query - sum the automatic overwrites, plus the R-series speedies, plus the RFDs, plus some proportion of appropriate-looking G6s and G8s (G14s using its redirect clause aren't ever distinguishable from other G14s in my experience). —Cryptic 20:09, 26 August 2024 (UTC)[reply]

Active admins

[edit]

I'm trying to figure out how many of the admins listed at Wikipedia:Active admins are also making more than a thousand edits a month. See WT:RFA for why I'm interested in doing that. Clovermoss🍀 (talk) 22:28, 26 August 2024 (UTC)[reply]

56, give or take. That's not quite what you're asking; it's the number of current admins (including adminbots) with more than 1000 edits in the last 30 days. I didn't crossref the "active admins" page since they should be mostly the same, give or take recent desysops (Pppery would've just missed the list with 956 edits) or re-activations. Going back more than 30 days is much slower, roughly five or six minutes per month (including the first). —Cryptic 23:07, 26 August 2024 (UTC)[reply]
Okay, thanks. Clovermoss🍀 (talk) 23:08, 26 August 2024 (UTC)[reply]