Wikipedia talk:WikiProject Punctuation
- Old talk is archived at subpages: Round 1
Round 2
There is a new dump out, so I am going to try to get Round 2 going. The new dump is in XML format, however. Does anybody know how I can import the XML dump of just the cur table into mysql without installing mediawiki? — brighterorange (talk) 02:14, 7 September 2005 (UTC)
- Will Navicat work? --Viriditas | Talk 07:52, 16 September 2005 (UTC)
- I don't know, but I already got it to work (after a long ordeal) using the mediawiki tool called "mwdumper". Thanks for the suggestion, though! — brighterorange (talk) 14:12, 16 September 2005 (UTC)
- (just a comment)... 16th! one day over deadline of 15th! just kidding. -- WB 07:36, 16 September 2005 (UTC)
- I know.. ;) it took like ten times longer than I thought it would to import the database. It's up now, though. — brighterorange (talk) 14:12, 16 September 2005 (UTC)
- Seems like most of the entries found are now spotted because of a lack of "list-izing," not because they lacked a period. lol. My work is mostly "list-izing" rather than "adding periods" now, ha. -- WB 01:09, 18 September 2005 (UTC)
- I think this is because many of the articles with low IDs were inserted en masse by computer, and contain lots of badly-formatted lists of facts. That phenomenon seems to die out with later dumps, although that may just be wishful thinking. Anyway, wikifying lists is still a valuable cleanup task, and will prevent them from showing up in the future. Thanks! — brighterorange (talk) 13:11, 18 September 2005 (UTC)
- You're welcome. I'm glad I'm doing this. There seem to be some very badly done articles with User comments... Some are just sad to edit, but I can't really do anything about "List of rulers in some unknown country I never heard of"... There should be something like "WikiProject: Get rid of old pages that don't make any sense" lol -- WB 05:14, 20 September 2005 (UTC)
- I think this is also because when we went through all the articles the first time, many of the participants just left out the ones that were not "period++". Hopefully, when we go through this time, we correct them so they don't get caught in the next scan. -- WB 03:43, 21 September 2005 (UTC)
UTF-8 encoding problems
For some reason, Periodbot does not insert a DOCTYPE declaration, so some characters are garbled up. It should check the wiki to see which encoding it is using and then use a DOCTYPE corresponding to that encoding. (I think all wikis are now UTF-8 encoded). Andrew pmk | Talk 21:12, 26 September 2005 (UTC)
- Will that fix it? I noticed this problem too (although it seems to work okay for links at least on Windows), and figured it was a result of the new xml dump format, which the wikitech folks claim "may have character set issues." I'll try the doctype thing; I can insert them manually in the dumps if it helps. — brighterorange (talk) 21:26, 26 September 2005 (UTC)
- Thanks, adding the "meta http-equiv" tag does seem to do it. It'll definitely be in there for the next run, and I'll see if I can batch-insert it in the existing dumps, since it's kind of a serious issue on Linux and maybe other platforms. — brighterorange (talk) 21:31, 26 September 2005 (UTC)
- I added the tag to all dumps. Let me know if you still have charset problems. — brighterorange (talk) 21:24, 27 September 2005 (UTC)
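A rough sketch of the kind of batch fix described in this thread, assuming each dump is an HTML page with a `<head>` element. The function name `add_charset_meta` and the exact tag text in `META_TAG` are illustrative assumptions; the thread only says that a "meta http-equiv" tag was added, not how.

```python
import re

# Assumed tag text: the thread does not quote the tag that was actually used.
META_TAG = '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'

# Matches an existing http-equiv Content-Type declaration, in any letter case.
HAS_META = re.compile(r'<meta[^>]*http-equiv=["\']?content-type', re.IGNORECASE)

# Matches the opening <head> tag (with or without attributes), but not <header>.
HEAD_OPEN = re.compile(r'<head(?:\s[^>]*)?>', re.IGNORECASE)


def add_charset_meta(html: str) -> str:
    """Insert a UTF-8 charset declaration right after the opening <head> tag.

    Pages that already declare a Content-Type, or that have no <head>,
    are returned unchanged.
    """
    if HAS_META.search(html):
        return html
    return HEAD_OPEN.sub(lambda m: m.group(0) + "\n" + META_TAG, html, count=1)
```

Running the function twice on the same page leaves it unchanged the second time, so it should be safe to apply to dumps that were already fixed by hand.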
Latest dumps
Just moaning--the dump file items that I'm working through are almost all in need of more significant work than just inserting periods, so it's going to take a while to work through one set (for me anyway). Copyvio tags needed, probable insignificant article tags needed, cleanup tags needed, titles misspelled or mislabeled, stubs without appropriate stub labels... I'm hardly inserting any periods at all! Argh. Elf | Talk 00:30, 1 October 2005 (UTC)
- Yes, I can't guarantee that all the articles you see will be good except for missing periods; just that they will have missing periods as the least of their problems. ;) But fixing such errorful articles is at least as valuable as fixing punctuation, even if you are just inserting {{cleanup-date|October 2005}} and letting others deal with it... — brighterorange (talk) 02:24, 1 October 2005 (UTC)
Well, yup, that's what I've been doing in most cases. ...Oh, yeah, according to WP it's October now! (writing at 9:20 pm Sept 30...) Elf | Talk 04:25, 1 October 2005 (UTC)
- Don't forget to add speedy delete candidates to that list... lol! --Celestianpower hablamé 14:11, 1 October 2005 (UTC)
new idea?
Is it possible to detect unnecessary spaces? Every single day, I find pages that have two or three spaces on the top or some other space because the editor thought it would be necessary. Nothing urgent, but annoying (at least to me). For example:
Wikipedia is an encyclopedia.
instead of:
Wikipedia is an encyclopedia.
It is an encyclopedia.
I can explain a bit more if I'm vague on this one. -- WB 03:42, 10 October 2005 (UTC)
- Yes, this would actually be a lot easier than detecting missing periods. I see this a lot, too. Do you think it's worth searching for? I expect we would get lots of hits. — brighterorange (talk) 14:15, 10 October 2005 (UTC)
- I think we can ask a bot to do this task though. There aren't many reasons why there should be two or more spaces... It does improve Wikipedia though. Think of a book that has random spaces between paragraphs. We wouldn't want that. Anyway, my thoughts. -- WB 17:24, 10 October 2005 (UTC)
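As a rough illustration of how simple such a check could be, here is a minimal sketch. It assumes that "unnecessary spaces" means blank lines at the very top of a page, or two or more blank lines in a row; that reading, and the function name `find_extra_blank_lines`, are assumptions rather than anything agreed in this thread.

```python
def find_extra_blank_lines(wikitext: str) -> list:
    """Return (first_line_number, run_length) for each suspicious blank run.

    A run of blank lines is reported when it starts on line 1 of the page,
    or when it contains two or more blank lines in a row. Line numbers are
    1-based, and whitespace-only lines count as blank.
    """
    hits = []
    run_start = None
    # The sentinel "x" closes a run of blank lines at the end of the text.
    for number, line in enumerate(wikitext.split("\n") + ["x"], start=1):
        if line.strip() == "":
            if run_start is None:
                run_start = number
        elif run_start is not None:
            length = number - run_start
            if run_start == 1 or length >= 2:
                hits.append((run_start, length))
            run_start = None
    return hits
```

A single blank line between paragraphs is left alone, since that is the normal paragraph break in wikitext.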
Observation
I have found, and I don't know if anyone else has, that the periodbot frequently picks up on alternate spellings, pronunciations, and synonyms as incomplete sentences. Almost 20% of my previous data file consisted of them.--Adun
- Often, many of the same type of false positive are clustered together, perhaps because all of those articles were added en masse and so they are near each other in the database. Can you elaborate on the pattern you saw? It may be pretty easy to filter out. I don't think I've ever seen it before. Brighterorange 15:40, 16 December 2005 (UTC)
- Sure thing. The part it would have in the dump would simply be the part (that I assume was at the top) where it would say "Alternate: Moor, Mour" (I'm just making this up). I think the PB picked it up because it didn't have a period at the end, which it doesn't have to.--Adun
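A filter for this pattern might look something like the sketch below. The label list in `LABELS` is hypothetical (the thread only offers "Alternate:" as a made-up example), and the function names are illustrative; this is not how Periodbot itself is written.

```python
import re

# Hypothetical labels: only "Alternate" comes from the example in this thread.
LABELS = ("alternate", "pronunciation", "synonyms")

# A line that starts with one of the labels followed by a colon.
LABEL_LINE = re.compile(r"^\s*(?:%s)\s*:" % "|".join(LABELS), re.IGNORECASE)


def is_label_line(line: str) -> bool:
    """True for 'Label: value, value' lines that need no final period."""
    return bool(LABEL_LINE.match(line))


def drop_label_lines(candidates):
    """Keep only the flagged lines that are not label lines."""
    return [c for c in candidates if not is_label_line(c)]
```

For example, `drop_label_lines(["Alternate: Moor, Mour", "He was born in 1900"])` keeps only the second entry, so the label line would no longer show up as a missing-period hit.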
Advice on "fixing" lists
Obviously we've all had lists come up in our dumps, but there doesn't seem to be anything on the project page describing how to deal with them (unless I need glasses!). Personally I just bullet point them (with *), mainly for style reasons - are there any other ways of dealing with lists that are formatted with overuse of the enter button? --Lox (t,c) 20:26, 12 January 2006 (UTC)
- Hey, I've noticed that sometimes album tracklistings are being changed and having a period added to the end of the list (e.g. Auf der Maur). I personally feel that this shouldn't happen - they're lists of titles that are named and punctuated as the artist intended - what do you guys think? Satan's Rubber Duck 08:12, 18 March 2006 (UTC)
- When I edited that, I had thought that the writer accidentally left out the period. I suppose I'm wrong in assuming that? NapoleonB 01:45, 30 March 2006 (UTC)
- Nothing that can't be fixed :) It's probably not wrong grammatically, but tracklists seem to have their own styles. Satan's Rubber Duck 11:27, 30 March 2006 (UTC)
- Good point. I'll be more careful editing track names in the future. :D NapoleonB 16:40, 30 March 2006 (UTC)
- I think bullet-pointing the lists is a good idea, and it will prevent them from being identified by Periodbot in the future. But don't get stressed out over things that are not explicitly part of this project if you don't want! — brighterorange (talk) 18:53, 30 March 2006 (UTC)
Project pages
Could project pages be omitted from the dumps? File #285 has quite a few project pages such as Wikipedia:Naming conventions (Slovenian vs Slovene)/Archive 1. I'm guessing those don't need fixing. Gimboid13 22:15, 4 February 2006 (UTC)
- That's really weird; anything from the Wikipedia namespace shouldn't be considered at all, since we're only looking at the article namespace. It's most likely a problem with the database dumps (?). — brighterorange (talk) 14:00, 18 April 2006 (UTC)
Checking grammar
I know that parsing English, or any natural language, is hard, but there are a few simple grammar checks that can be done. Collecting information on common mistakes is also useful for writers of checkers (mine are here).
One common mistake is for the same word to appear twice twice. This is not always a bug, but it is often unintended.
If anybody is interested in running a grammar checker over the English Wikipedia articles there are some links to useful tools and data here.
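The repeated-word check mentioned above is easy to sketch with a regular expression. This is only a minimal illustration; the names `REPEATED_WORD` and `find_repeated_words` are made up here, and legitimate repeats such as "had had" will also be reported, so the output is a list of candidates to review rather than a list of errors.

```python
import re

# A whole word, some whitespace, then the same whole word again.
# With IGNORECASE, "The the" is matched as well as "the the".
REPEATED_WORD = re.compile(r"\b(\w+)\s+\1\b", re.IGNORECASE)


def find_repeated_words(text: str) -> list:
    """Return each immediately repeated word, lowercased, in order of appearance."""
    return [m.group(1).lower() for m in REPEATED_WORD.finditer(text)]
```

Called on the sentence "One common mistake is for the same word to appear twice twice.", it returns `['twice']`.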
Random Thought
Just found this project and am very glad to be able to help out on Wikipedia without knowing a ton about some random area of knowledge! As I've just gotten really into those userboxes, I think it would be fun if someone with more know-how than I could whip up one of those "this user participates in the punctuation wikiproject."--Lowfatsourcreme 17:36, 3 April 2006 (UTC)
I was just thinking that!!
Reedy Boy 06:39, 18 April 2006 (UTC)
- I created one. You can add {{User project punctuation}} to your userpage, which will produce:
• | This user participates in Project Punctuation. |
Enjoy! — brighterorange (talk) 14:16, 18 April 2006 (UTC)
Yay, Boxes
Maybe an option to put the amount done.... Or maybe not. lol
Thanks!
Reedy Boy 16:24, 18 April 2006 (UTC)
Taking over Dump Files Started by Other People
Are we allowed to do this?
I've got a few done tonight, and I noticed some are from February, and they really need completing.
If we are allowed, can I just complete it and then delete it?
Reedy Boy 19:25, 10 April 2006 (UTC)
- If it's more than a week or so old, go for it! We're almost done! — brighterorange (talk) 14:02, 18 April 2006 (UTC)
And we're done!!!
Yay!!
Reedy Boy 07:36, 6 May 2006 (UTC)
- I really hope that the use of 'were' is a joke on your part, seeing as this is the 'Project Punctuation' page. Berry 11:34, 6 May 2006 (UTC)
- Yup
LOL
Reedy Boy 20:54, 6 May 2006 (UTC)
- Well done, everyone! We'll take a break for a few months. I think that perhaps the next round will be a different (punctuation) analysis. Maybe the proper use of en dashes and em dashes? I'd also like to make the process somewhat more automated through the use of client-side scripting. If anyone wants to help out on the development side of this project (and has some expertise), let me know! — brighterorange (talk) 14:47, 9 May 2006 (UTC)
- It'd be very good if you could get the server to reduce the number of items created, such as ." not being included and so on...?
What is it written in? Reedy Boy 06:59, 10 May 2006 (UTC)
- The analysis code is in Standard ML. It does already filter out punctuation at the end of a quotation, though some things do confuse it. Do you have a specific rule in mind? — brighterorange (talk) 14:06, 10 May 2006 (UTC)
Project directory
Hello. The WikiProject Council has recently updated the Wikipedia:WikiProject Council/Directory. This new directory includes a variety of categories and subcategories which will, with luck, potentially draw new members to the projects who are interested in those specific subjects. Please review the directory and make any changes to the entries for your project that you see fit. There is also a directory of portals, at User:B2T2/Portal, listing all the existing portals. Feel free to add any of them to the portals or comments section of your entries in the directory. The three columns regarding assessment, peer review, and collaboration are included in the directory for both the use of the projects themselves and for that of others. Having such departments will allow a project to more quickly and easily identify its most important articles and its articles in greatest need of improvement. If you have not already done so, please consider whether your project would benefit from having departments which deal in these matters. It is my hope that all the changes to the directory can be finished by the first of next month. Please feel free to make any changes you see fit to the entries for your project before then. If you should have any questions regarding this matter, please do not hesitate to contact me. Thank you. B2T2 14:15, 26 October 2006 (UTC)
Wikipedia Day Awards
Hello, all. It was initially my hope to try to have this done as part of Esperanza's proposal for an appreciation week to end on Wikipedia Day, January 15. However, several people have once again proposed the entirety of Esperanza for deletion, so that might not work. It was the intention of the Appreciation Week proposal to set aside a given time when the various individuals who have made significant, valuable contributions to the encyclopedia would be recognized and honored. I believe that, with some effort, this could still be done. My proposal is to, with luck, try to organize the various WikiProjects and other entities of Wikipedia to take part in a larger celebration of its contributors to take place in January, probably beginning January 15, 2007. I have created yet another new subpage for myself (a weakness of mine, I'm afraid) at User talk:Badbilltucker/Appreciation Week where I would greatly appreciate any indications from the members of this project as to whether and how they might be willing and/or able to assist in recognizing the contributions of our editors. Thank you for your attention. Badbilltucker 19:28, 30 December 2006 (UTC)
Come back project punctuation!
When will the next dump be out? I liked helping out with this project! J. Finkelstein 06:39, 25 April 2007 (UTC)
- Well, I don't have any immediate plans to run another round (unfortunately the size of the database dumps makes it rather a large effort for me and harder each time), but I have been working on ideas for the next iteration. Particularly, I've been writing a client-side script that automatically corrects punctuation errors for any page. You can take a look at User:Brighterorange/punctuation.js and User:Brighterorange/punctuationtest if you're interested in what I've done so far. I've been using it, but I'm not sure it's ready for others yet. — brighterorange (talk) 00:35, 26 April 2007 (UTC)
Place names
I just had someone rename an article removing the full stop from St. George. Please tell me this is NOT a Wikipedia standard. I see nothing in the Manual of Style regarding this, and there must be plenty of people who are as irritated as I am by people removing full stops and apostrophes from places named after people. Mdw0 (talk) 02:32, 19 February 2009 (UTC)
Debate over hyphens vs en-dashes at Wikipedia talk:Manual of Style
Hopefully somebody here can provide some insight on the debate going on regarding hyphenation of "Ural-Altaic languages" over at Wikipedia talk:Manual of Style#en dashes vs hyphens. --Wulf (talk) 22:13, 15 March 2009 (UTC)
- Debate has moved to Talk:Ural-Altaic languages#Requested move. —Wulf (talk) 20:20, 18 June 2009 (UTC)
New dashes tool
I've created a new tool for fixing common hyphens/dashes/minus signs mistakes. It does a better job than the other tools I'm aware of, rarely missing needed changes or making incorrect changes. —GregU (talk) 07:10, 4 November 2009 (UTC)
Dumps
Are these required any more? Or has the original point of this project largely been usurped by tools such as AWB? TheGrappler (talk) 23:10, 24 August 2010 (UTC)
Is there a member of this Project who can help me with a question I have about MOS:LQ? I really want to understand it. This is the first time I've come to this Project; I'm assuming there are punctuation experts here who understand MOS:LQ well and who care! (I'll post specifics after hearing from a volunteer. Thank you!) Ihardlythinkso (talk) 04:47, 8 October 2011 (UTC)
Comment on the WikiProject X proposal
Hello there! As you may already know, most WikiProjects here on Wikipedia struggle to stay active after they've been founded. I believe there is a lot of potential for WikiProjects to facilitate collaboration across subject areas, so I have submitted a grant proposal with the Wikimedia Foundation for the "WikiProject X" project. WikiProject X will study what makes WikiProjects succeed in retaining editors and then design a prototype WikiProject system that will recruit contributors to WikiProjects and help them run effectively. Please review the proposal here and leave feedback. If you have any questions, you can ask on the proposal page or leave a message on my talk page. Thank you for your time! (Also, sorry about the posting mistake earlier. If someone already moved my message to the talk page, feel free to remove this posting.) Harej (talk) 22:47, 1 October 2014 (UTC)
WikiProject X is live!
Hello everyone!
You may have received a message from me earlier asking you to comment on my WikiProject X proposal. The good news is that WikiProject X is now live! In our first phase, we are focusing on research. At this time, we are looking for people to share their experiences with WikiProjects: good, bad, or neutral. We are also looking for WikiProjects that may be interested in trying out new tools and layouts that will make participating easier and projects easier to maintain. If you or your WikiProject are interested, check us out! Note that this is an opt-in program; no WikiProject will be required to change anything against its wishes. Please let me know if you have any questions. Thank you!
Note: To receive additional notifications about WikiProject X on this talk page, please add this page to Wikipedia:WikiProject X/Newsletter. Otherwise, this will be the last notification sent about WikiProject X.